What is this?
A browser-based Short-Time Fourier Transform (STFT) spectrogram viewer.
It decodes an audio file (or a live microphone recording) into a 2-D time–frequency
heatmap, coloured with the Inferno palette: black = silence, yellow/white = peak
energy. All processing happens locally in your browser, with no server and no upload.
Quick start
- Load audio — click Browse… to pick a local audio file
(MP3, WAV, OGG, M4A, …), or press ● Record to capture
microphone audio directly. Stop recording with ■ Stop.
Optionally press ↓ Save Recording to download the captured
audio as a WAV file.
- Adjust parameters (optional) — tweak Frame Size, Hop Size, Window,
and the time range in the PARAMETERS panel.
- Run — press ▶ Run. The spectrogram appears in
the main panel.
- Explore — scroll the spectrogram horizontally, zoom in by dragging
a rectangle, and double-click (or press ⊞ Reset) to zoom back out.
- Play — press ▶ Play to listen while a red cursor
line tracks position in real time.
- Re-run — change any parameter (including the comparison mode)
and press ▶ Run again. Recorded audio is kept in memory, so
no re-recording is needed.
Parameters explained
| Parameter | What it controls |
|---|---|
| Sample Rate (Hz) | Target analysis sample rate. Audio is resampled to this rate if it differs from the file's native rate. The highest frequency shown is the Nyquist frequency (half the sample rate), so a higher rate extends the frequency axis upward. |
| Frame Size | FFT window length in samples (must be a power of 2: 128, 256, 512 …). Larger frames give finer frequency resolution (bin spacing = Sample Rate / Frame Size) but coarser time resolution. |
| Hop Size | Step between successive frames, in samples. A hop smaller than Frame Size gives overlapping frames and a smoother, denser spectrogram along the time axis. |
| Window | Windowing function applied to each frame before the FFT; it trades spectral leakage against frequency resolution. Options: Rectangular, Hann, Hamming (default), Blackman (see the list under this table). |
| Start Time (s) | Offset into the audio file at which analysis begins. |
| Segment Duration (s) | Length of audio to analyse (max 10 s per run). |
| Time Scale (px/s) | Horizontal zoom slider, in pixels per second. Drag right to stretch the time axis; the spectrogram rescales immediately without pressing Run. The current value is shown next to the label (range: 10 – 1 500 px/s, step 10). |

Window options:
- Rectangular: no windowing; narrowest main lobe (sharpest frequency resolution) but
highest sidelobe leakage.
- Hann: smooth roll-off; good general-purpose choice for speech
and music.
- Hamming (default): lower peak sidelobe than Hann, although its sidelobes roll off
more slowly; standard in speech processing.
- Blackman: lowest sidelobe levels of the four; best for detecting weak
components next to strong ones, at the cost of the widest main lobe.
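The trade-offs in the table above come down to a few lines of arithmetic. The sketch below is only an illustration: the function name `stft_geometry` and its keys are hypothetical, and it assumes that only frames fitting entirely inside the segment are counted (the app may pad the last frame instead).

```python
def stft_geometry(sample_rate, frame_size, hop_size, duration_s, px_per_s=100):
    """Derive the quantities the parameter table refers to (hypothetical helper)."""
    n_samples = int(duration_s * sample_rate)
    # Assumption: count only frames that fit entirely inside the segment.
    if n_samples < frame_size:
        n_frames = 0
    else:
        n_frames = 1 + (n_samples - frame_size) // hop_size
    return {
        "nyquist_hz": sample_rate / 2,                     # top of the frequency axis
        "bin_spacing_hz": sample_rate / frame_size,        # frequency resolution
        "frame_duration_ms": 1000 * frame_size / sample_rate,  # time span of one frame
        "hop_duration_ms": 1000 * hop_size / sample_rate,  # time step between columns
        "overlap_fraction": max(0.0, 1 - hop_size / frame_size),
        "n_frames": n_frames,                              # spectrogram columns
        "n_bins": frame_size // 2 + 1,                     # rows from 0 Hz to Nyquist
        "width_px": duration_s * px_per_s,                 # Time Scale slider effect
    }


geometry = stft_geometry(sample_rate=16000, frame_size=512, hop_size=128, duration_s=2.0)
for key, value in geometry.items():
    print(f"{key}: {value}")
```

Doubling `frame_size` halves `bin_spacing_hz` but doubles `frame_duration_ms`, which is the frequency-versus-time trade-off described for Frame Size.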
Input field limits
| Field | Constraint | Notes |
|---|---|---|
| Sample Rate (Hz) | 3 000 – 768 000 Hz | Enforced by the Web Audio API's OfflineAudioContext. The exact limits vary by browser; the Web Audio specification only requires support for 8 000 – 96 000 Hz. Values the browser does not support cause an error during resampling. |
| Frame Size | Positive power of 2 | Required by the FFT algorithm (fft.js). Use 128, 256, 512, 1024, etc. Values that are not powers of 2 are rejected with an error message. |
| Hop Size | Positive integer ≥ 1 | Should be less than or equal to Frame Size so that no samples are skipped. No upper bound is enforced, but a hop larger than Frame Size leaves gaps between frames that are never analysed. |
| Start Time (s) | 0 ≤ start < audio duration | Must be within the loaded audio. Values outside this range are rejected with an error. |
| Segment Duration (s) | > 0 s, ≤ remaining audio | If the value exceeds the audio remaining from Start Time, it is silently clamped to the available length and the field is updated. |
| Time Scale (px/s) | 10 – 1 500 px/s | Controlled by a range slider; bounds are enforced by the <input type="range"> element itself. |
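The checks listed in the table above could be sketched as follows. This is not the app's own validation code: the names `is_power_of_two` and `validate_inputs` are hypothetical, and the error messages are placeholders.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... using the single-set-bit test."""
    return isinstance(n, int) and n > 0 and (n & (n - 1)) == 0


def validate_inputs(frame_size, hop_size, start_s, duration_s, audio_duration_s):
    """Return (start_s, duration_s) with the duration clamped, or raise ValueError."""
    if not is_power_of_two(frame_size):
        raise ValueError("Frame Size must be a positive power of 2")
    if not (isinstance(hop_size, int) and hop_size >= 1):
        raise ValueError("Hop Size must be a positive integer")
    if not (0 <= start_s < audio_duration_s):
        raise ValueError("Start Time must lie inside the loaded audio")
    if duration_s <= 0:
        raise ValueError("Segment Duration must be greater than 0")
    # A too-long duration is clamped rather than rejected, as the table describes.
    remaining_s = audio_duration_s - start_s
    return start_s, min(duration_s, remaining_s)


# A 10 s request starting at 1 s into a 4 s clip is clamped to the remaining 3 s.
print(validate_inputs(512, 128, start_s=1.0, duration_s=10.0, audio_duration_s=4.0))
```

Note that a Hop Size larger than Frame Size passes this check, matching the table: it is discouraged but not rejected.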
Spectrogram display
- X-axis — time in seconds from the chosen Start Time.
- Y-axis — frequency in Hz (0 Hz at bottom, Nyquist at top).
- Colour — power in dB, normalised so the loudest bin = 0 dB.
The floor is −80 dB. Colours follow the Inferno palette
(black → purple → orange → yellow).
- Colorbar — scale reference on the right edge of the panel.
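The colour scaling described in the list above can be sketched like this. It is a minimal illustration that assumes the input is linear power (hence 10·log10 rather than 20·log10); `to_normalised_db` and `colour_position` are hypothetical names, not functions from the app.

```python
import math


def to_normalised_db(power, floor_db=-80.0):
    """Convert a 2-D list of linear power values to dB relative to the loudest bin."""
    peak = max(max(row) for row in power)
    if peak <= 0:
        # Pure silence: every bin sits on the floor.
        return [[floor_db for _ in row] for row in power]
    result = []
    for row in power:
        db_row = []
        for p in row:
            db = 10.0 * math.log10(p / peak) if p > 0 else floor_db
            db_row.append(max(floor_db, db))  # clip everything below the floor
        result.append(db_row)
    return result


def colour_position(db, floor_db=-80.0):
    """Map [floor_db, 0 dB] onto [0, 1], the position along the colour palette."""
    return (db - floor_db) / (0.0 - floor_db)


spectrogram_db = to_normalised_db([[1.0, 0.1], [1e-12, 0.0]])
print(spectrogram_db)
print([[colour_position(v) for v in row] for row in spectrogram_db])
```

The loudest bin maps to position 1 (the bright end of the palette), and anything at or below the floor maps to position 0.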
Zoom & scroll
- Drag on the spectrogram to draw a zoom rectangle
(crosshair cursor).
- Double-click the spectrogram or press ⊞ Reset
to return to the full view.
- The ⇔ Scroll / 🔍 Zoom button (top-left of
the spectrogram, visible after Run) toggles between pan mode (drag
to scroll along the time axis) and zoom mode (drag to draw a zoom
rectangle). This is particularly useful on touch screens, where a drag gesture would
otherwise conflict with the browser's native scroll.
Window function panel
The WINDOW FUNCTION section at the bottom of the sidebar shows a
live plot of the selected window's shape, updated whenever the Frame Size or
Window type changes.
- In Single mode the panel renders one curve with its mathematical
equation typeset in LaTeX below the canvas.
- In Side-by-Side mode all four windows are overlaid on the same
axes with a colour-coded legend, making amplitude differences visible at a glance
(e.g. Hamming's non-zero endpoints vs. Hann's endpoints, which are exactly zero).
- The Y-axis is fixed at [0, 1] — no per-window auto-scaling — so relative
heights are meaningful across window types.
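For reference, the textbook definitions of the four windows are given below for a frame of N samples, n = 0, …, N − 1. These are the common symmetric forms and are an assumption here: the app may use slightly different coefficients or the periodic form (N instead of N − 1 in the denominator).

```latex
\begin{aligned}
w_{\text{rect}}[n]     &= 1 \\
w_{\text{Hann}}[n]     &= 0.5 - 0.5\cos\!\left(\frac{2\pi n}{N-1}\right) \\
w_{\text{Hamming}}[n]  &= 0.54 - 0.46\cos\!\left(\frac{2\pi n}{N-1}\right) \\
w_{\text{Blackman}}[n] &= 0.42 - 0.5\cos\!\left(\frac{2\pi n}{N-1}\right)
                          + 0.08\cos\!\left(\frac{4\pi n}{N-1}\right)
\end{aligned}
```

Setting n = 0 gives the endpoint values: 0.5 − 0.5 = 0 for Hann, 0.54 − 0.46 = 0.08 for Hamming, and 0.42 − 0.5 + 0.08 = 0 for Blackman. This is why Hamming is the only tapered curve that does not touch the bottom of the fixed [0, 1] axis.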
Below the shape plot, a Frequency Response canvas shows the
magnitude spectrum of the window in dB, zoomed to the main lobe and first few
sidelobes. This makes spectral leakage directly readable:
- Rectangular: narrow main lobe; first sidelobe ≈ −13 dB, slow rolloff.
- Hann: wider main lobe; sidelobes ≈ −32 dB with fast rolloff.
- Hamming: similar width to Hann; sidelobes plateau near −43 dB.
- Blackman: widest main lobe; peak sidelobe ≈ −58 dB (classic three-term coefficients), with fast rolloff.
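Sidelobe figures like those in the list above can be checked numerically. The sketch below assumes the common symmetric textbook coefficients, which may not match the app's exactly, and uses a plain zero-padded DFT instead of fft.js; `make_window` and `peak_sidelobe_db` are hypothetical names.

```python
import cmath
import math


def make_window(name, n):
    """Symmetric textbook windows (coefficients are assumptions, not read from the app)."""
    def sample(k):
        x = 2.0 * math.pi * k / (n - 1)
        if name == "rectangular":
            return 1.0
        if name == "hann":
            return 0.5 - 0.5 * math.cos(x)
        if name == "hamming":
            return 0.54 - 0.46 * math.cos(x)
        if name == "blackman":
            return 0.42 - 0.5 * math.cos(x) + 0.08 * math.cos(2.0 * x)
        raise ValueError(f"unknown window: {name}")
    return [sample(k) for k in range(n)]


def peak_sidelobe_db(window, oversample=32):
    """Highest sidelobe relative to the main-lobe peak, in dB."""
    n_fft = len(window) * oversample  # finer frequency grid, same effect as zero-padding
    mags = []
    for k in range(n_fft // 2 + 1):   # frequencies from 0 to Nyquist
        omega = 2.0 * math.pi * k / n_fft
        total = sum(w * cmath.exp(-1j * omega * i) for i, w in enumerate(window))
        mags.append(abs(total))
    # Walk down the main lobe until the magnitude stops falling (first null).
    k = 0
    while k + 1 < len(mags) and mags[k + 1] < mags[k]:
        k += 1
    return 20.0 * math.log10(max(mags[k:]) / mags[0])


for name in ("rectangular", "hann", "hamming", "blackman"):
    print(f"{name:12s} {peak_sidelobe_db(make_window(name, 64)):7.1f} dB")
```

With these coefficients and a 64-sample frame, the results should land near −13, −31, −43 and −58 dB respectively; other frame sizes or coefficient variants shift the figures slightly.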
Comparison mode
Set Mode to Side-by-Side and press ▶ Run
to display four spectrograms simultaneously — one for each window function
(Rectangular, Hann, Hamming, Blackman). Panels share the same frequency axis
and scroll in sync, so window-function effects are directly comparable.
You can switch between Single and Side-by-Side freely and re-run without
reloading audio.
Sidebar collapse
Press « (top-right of the sidebar) to collapse the control panel and
give the spectrogram more horizontal space. Click any icon to expand a specific
section, or press » to fully restore the sidebar.
License & attribution
This project is released under the MIT License. Copyright © 2026 Ilamparithi Murali.
It uses the following open-source libraries:
- fft.js v4.0.4 (Fedor Indutny, MIT License)
- KaTeX (Khan Academy, MIT License)