Real-time spectral analysis

Visualizing the Mathematics of Sound

Harmonic decomposition, spectral analysis, and interactive audio intelligence. See sound the way physics does.

Waveform - Time Domain (readout: RMS). Load audio or enable the microphone to begin.
FFT Spectrum - Frequency Domain (readout: Peak). X(f) = integral of x(t) e^(-i 2 pi f t) dt
Spectrogram - Time x Frequency (readout: Centroid). Rolling FFT visualization: time x frequency x intensity.
Harmonic Map - Overtone Series (readout: HCS). f_n = n times f0, the harmonic overtone series.

Metrics bar (readouts show -- until audio is loaded): Fundamental, Detected Note, Spectral Centroid (Hz), RMS Energy (dB), Zero Crossing Rate (per frame), Harmonic Complexity, Spectral Rolloff (Hz).

Multi-Track Spectral Comparison

Add two or more audio files to compare their spectral fingerprints side by side.

No audio loaded. Open the Analyzer, load a file or enable the microphone, then return here for live context about your specific sound.
Learn - Overview

What is Soniform measuring?

Every sound is a pressure wave. What makes a violin sound different from a trumpet is the hidden mathematical structure inside that wave. Soniform decomposes sound in real time using the Fourier Transform and displays that structure across seven measurements and four visualizations.

Each metric in the bottom bar is recomputed every animation frame from your audio. Together they form a spectral fingerprint of the sound.

The seven metrics at a glance

Fundamental

The lowest frequency component of a pitched sound, and the one that sets its perceived pitch. What a musician calls the note.

Detected Note

Nearest musical note name (A, C-sharp, etc.) plus cents sharp or flat.

Spectral Centroid

Centre of mass of the spectrum. High value means bright. Low value means warm.

RMS Energy

Overall loudness of the signal in decibels.

Zero Crossing Rate

How many times the waveform crosses zero per frame. A proxy for tonality versus noise.

Harmonic Complexity

Score from 0 to 100 measuring how evenly energy spreads across the overtone series.

Spectral Rolloff

Frequency below which 85% of the spectrum energy lives.

Use the sidebar to explore any metric in depth. If audio is loaded, each section shows your live reading and explains what it means for the sound you are hearing right now.

Learn - Fundamental Frequency

Fundamental Frequency

When a string or air column vibrates, its base rate of oscillation is the fundamental frequency (f0), the pitch your ear identifies.

f0 = 1 / T    where T is the period of one full cycle in seconds

Soniform detects f0 using autocorrelation: it slides a copy of the waveform across itself and finds the non-zero time lag at which the two copies match best. That lag equals one period T, and f0 = 1 / T gives the fundamental.
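
The idea can be sketched in a few lines of Python. This is an illustrative sketch only, not Soniform's actual code: the function name estimate_f0 and the f_min and f_max search limits are assumptions made for the example.

```python
import math

def estimate_f0(samples, sample_rate, f_min=50.0, f_max=1000.0):
    """Return the frequency whose period (lag) gives the highest autocorrelation."""
    n = len(samples)
    min_lag = max(1, int(sample_rate / f_max))      # shortest period searched
    max_lag = min(int(sample_rate / f_min), n - 1)  # longest period searched
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        overlap = n - lag
        # Multiply the frame by a copy of itself shifted by `lag` samples,
        # averaged over the overlapping region.
        score = sum(samples[i] * samples[i + lag] for i in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

# A 200 Hz test tone sampled at 8 kHz has a period of 40 samples.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(1024)]
f0 = estimate_f0(tone, sample_rate)
```

Practical detectors usually add safeguards against octave errors, since a periodic signal also matches itself at two, three, or more periods.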

--
Your current fundamental

The dominant pitch frequency detected in real time.

Middle C on a piano is 261.6 Hz. The fundamental of adult speech typically sits between 85 and 255 Hz. A piccolo can reach about 4 kHz. A high number means high pitch; a low number means bass.

autocorrelation, pitch detection, periodicity
Learn - Detected Note

Detected Note

Western music divides the octave into 12 equal semitones. Soniform maps the detected fundamental to the nearest note using:

semitones = 12 times log2(f0 / 440)

440 Hz is A4, the international tuning reference. Cents are hundredths of a semitone. A reading of 0 cents means perfectly in tune. A reading of 50 cents means halfway between two notes.
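
The mapping from frequency to note name and cents can be sketched as follows. The function name nearest_note and the sharps-only note spelling are assumptions for illustration; the octave numbering follows the common convention in which MIDI note 69 is A4.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(f0, a4=440.0):
    """Return (note name with octave, cents deviation) for a frequency in Hz."""
    semitones = 12 * math.log2(f0 / a4)         # distance from A4 in semitones
    nearest = round(semitones)                  # closest equal-tempered note
    cents = round(100 * (semitones - nearest))  # positive means sharp
    midi = 69 + nearest                         # MIDI note 69 is A4
    name = NOTE_NAMES[midi % 12]
    octave = midi // 12 - 1
    return f"{name}{octave}", cents

note, cents = nearest_note(452.0)  # slightly sharp of A4
```

A reading of 440 Hz returns A4 at 0 cents, and anything within half a semitone either side is reported as the same note with a non-zero cents value.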

--
Nearest musical note

Includes octave number and cents deviation from perfect tuning.

Singers and instrumentalists target 0 cents. Vibrato typically swings the reading by roughly plus or minus 20 to 50 cents around the target pitch.

equal temperament, A440, cents
Learn - Spectral Centroid

Spectral Centroid

Imagine the frequency spectrum as a see-saw. Each frequency bin has a weight equal to its amplitude. The centroid is where that see-saw balances, the centre of mass of the spectrum.

centroid = Sum(fi times |X(fi)|) / Sum(|X(fi)|)

It is one of the strongest single predictors of perceived brightness. A high centroid sounds tinny, bright, or metallic. A low centroid sounds warm, dark, or bass-heavy.
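
As a minimal sketch of the centroid formula, assuming the spectrum is available as two parallel lists (bin frequencies in Hz and their magnitudes); the function name spectral_centroid is hypothetical:

```python
def spectral_centroid(freqs, magnitudes):
    """Amplitude-weighted mean frequency of a spectrum."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total

# Most of the weight sits at 100 Hz, so the balance point stays low.
centroid = spectral_centroid([100.0, 1000.0], [3.0, 1.0])
```

Moving weight toward the higher bin pulls the result upward, which is the see-saw behaviour described above.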

--
Spectral centre of mass

Correlates strongly with how bright or dark the sound is perceived.

A soft cello might show 600 Hz. A cymbal crash might show 8000 Hz.

timbre, brightness, MIR
Learn - RMS Energy

RMS Energy

Root Mean Square (RMS) measures the effective level of the signal: the square root of its average power. It tracks perceived loudness far better than peak level.

RMS = sqrt( (1/N) times Sum x[n]^2 )    level in dB = 20 times log10(RMS)

Levels are relative to digital full scale, where an RMS of 1.0 corresponds to 0 dB. Silence is near -90 dB. A quiet room is around -60 dB. A speaking voice sits around -30 to -15 dB. Loud music approaches 0 dB. A reading above 0 dB indicates clipping.
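
A short sketch of the calculation, assuming samples are floats in the range -1.0 to 1.0. The function name rms_db and the -90 dB floor used for silent frames are assumptions for the example:

```python
import math

def rms_db(samples, floor_db=-90.0):
    """RMS level of a frame in dB relative to full scale, clamped at floor_db."""
    if not samples:
        return floor_db
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms <= 0:
        return floor_db  # log10(0) is undefined, so report the floor instead
    return max(20 * math.log10(rms), floor_db)

# Ten full cycles of a full-scale sine wave.
sine = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
level = rms_db(sine)
```

A full-scale sine has an RMS of 1/sqrt(2), so its level is about -3 dB even though its peaks touch full scale. That gap is one reason RMS and peak readings differ.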

--
Signal power in decibels

Loud is near 0 dB. Silence is near -90 dB.

loudness, decibels, signal power
Learn - Zero Crossing Rate

Zero Crossing Rate

Every time the waveform crosses the zero axis, that is a zero crossing. ZCR counts how many happen per analysis frame.

ZCR = Sum |sign(x[n]) - sign(x[n-1])| / 2

At a 44.1 kHz sample rate, a pure 440 Hz sine wave crosses roughly 41 times per 2048-sample frame (about 20.4 cycles, two crossings per cycle). Noisy broadband sounds cross hundreds of times. Clean tonal sounds cross far fewer times.
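
The counting formula translates directly into code. In this sketch the helper names are hypothetical, and treating an exact zero sample as positive is an assumption made to keep the sign function two-valued:

```python
import math

def sign(x):
    """Two-valued sign: zero is treated as positive (an assumption of this sketch)."""
    return 1 if x >= 0 else -1

def zero_crossings(samples):
    """Count sign changes between consecutive samples in one frame."""
    # Each crossing contributes |(+1) - (-1)| = 2, so halve the total.
    return sum(abs(sign(b) - sign(a)) for a, b in zip(samples, samples[1:])) // 2

# One 2048-sample frame of a 440 Hz sine at a 44.1 kHz sample rate.
frame = [math.sin(2 * math.pi * 440 * i / 44100) for i in range(2048)]
crossings = zero_crossings(frame)
```

For this frame the count lands close to the figure quoted above; the exact value shifts by one depending on where in its cycle the sine starts.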

--
Zero crossings per frame

Low means tonal or pitched. High means noisy or percussive.

noisiness, tonality, transients
Learn - Harmonic Complexity Score

Harmonic Complexity Score

A pitched sound distributes its energy across a harmonic series, the fundamental and its integer multiples. How that energy is spread tells us about timbre.

HCS = 100 times ( -Sum pn times log2(pn) ) / log2(N)

Here pn is the share of the total held by harmonic n (the pn sum to 1) and N is the number of harmonics analysed. This is the Shannon entropy of the overtone distribution, normalised to 100. A score of 0 means all energy is in one harmonic (pure sine wave). A score of 100 means energy is spread perfectly evenly across all N harmonics.
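
The formula can be sketched as below. The function name harmonic_complexity is hypothetical, and the sketch does not settle whether the input weights should be amplitudes or energies, since the text does not pin that down:

```python
import math

def harmonic_complexity(harmonic_weights):
    """Normalised Shannon entropy (0 to 100) of a list of harmonic weights."""
    total = sum(harmonic_weights)
    n = len(harmonic_weights)
    if total <= 0 or n < 2:
        return 0.0
    entropy = 0.0
    for weight in harmonic_weights:
        p = weight / total            # share of the total held by this harmonic
        if p > 0:                     # 0 * log2(0) is taken as 0
            entropy -= p * math.log2(p)
    return 100 * entropy / math.log2(n)

pure = harmonic_complexity([1.0, 0.0, 0.0, 0.0])    # everything in one harmonic
even = harmonic_complexity([1.0, 1.0, 1.0, 1.0])    # perfectly even spread
```

Dividing by log2(N) is what makes the score comparable across different harmonic counts, because log2(N) is the largest entropy N values can have.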

--
Overtone entropy (0 to 100)

Flute is around 20 to 35. Guitar is around 50 to 70. Full orchestra is 80 and above.

This is Soniform's original metric. Simple flutes concentrate energy at the fundamental. Violins and brass spread it across many overtones, which yields higher scores.

entropy, overtones, timbre, original metric
Learn - Spectral Rolloff

Spectral Rolloff

Rolloff is the frequency below which 85% of total spectral energy lives. Think of it as a brightness threshold.

Find f* such that: Sum over f less than or equal to f* of |X(f)|^2 = 0.85 times total energy

High rolloff means significant high-frequency content such as cymbals or consonants. Low rolloff means bass or kick drum is dominant.
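
One way to find the rolloff frequency is to accumulate energy bin by bin until the threshold is reached. This is a sketch under the assumption that the spectrum is given as parallel lists of bin frequencies and magnitudes; the function name spectral_rolloff is hypothetical:

```python
def spectral_rolloff(freqs, magnitudes, fraction=0.85):
    """Lowest bin frequency at which cumulative energy reaches `fraction` of the total."""
    energies = [m * m for m in magnitudes]   # energy is magnitude squared
    threshold = fraction * sum(energies)
    running = 0.0
    for freq, energy in zip(freqs, energies):
        running += energy
        if running >= threshold:
            return freq
    return freqs[-1] if freqs else 0.0

# Ten bins with equal energy: 85% is first reached at the ninth bin.
freqs = [100.0 * k for k in range(1, 11)]
rolloff = spectral_rolloff(freqs, [1.0] * 10)
```

Because the result is always one of the bin frequencies, its precision is limited by the bin spacing of the FFT.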

--
85% energy threshold

Bass-heavy sounds have low rolloff. Bright or noisy sounds have high rolloff.

brightness, 85th percentile, energy distribution
Learn - FFT and Spectrum

The FFT Spectrum

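The Fast Fourier Transform takes a snapshot of N time-domain samples and decomposes it into constituent sine waves at evenly spaced frequencies (bins) from 0 Hz up to the Nyquist limit, which is half the sample rate. Soniform displays the range from 20 Hz upward.

X[k] = Sum from n=0 to N-1 of x[n] times e^(-i 2 pi k n / N)    where bin k sits at frequency k times fs / N and fs is the sample rate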

Soniform plots magnitudes on a logarithmic frequency axis so octaves appear equally spaced, which matches how the ear perceives pitch intervals.
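
To make the transform concrete, here is a direct (slow) evaluation of the discrete Fourier sum. It is a teaching sketch, not an FFT: it gives the same magnitudes as an FFT would but takes on the order of N squared operations. The function name dft_magnitudes and the test signal are assumptions for the example.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of bins 0..N/2 computed directly from the DFT sum."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        mags.append(abs(total))
    return mags

# A 1000 Hz sine, 64 samples at 8 kHz: bin spacing is 8000 / 64 = 125 Hz.
sample_rate, n = 8000, 64
tone = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(n)]
mags = dft_magnitudes(tone)
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * sample_rate / n
```

The tone lands exactly on bin 8 because 1000 Hz is a whole multiple of the 125 Hz bin spacing. Frequencies that fall between bins spread across neighbours, which is where windowing comes in.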

FFT Size

Larger FFT means more frequency bins and sharper spectral detail, but each frame covers more time so fast transients blur. Smaller FFT means faster response but coarser resolution. 2048 is the default balance.
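
The trade-off is simple arithmetic: bin spacing is the sample rate divided by the FFT size, and the frame duration is the FFT size divided by the sample rate. The sketch below assumes a 44.1 kHz sample rate, and the function name fft_tradeoff is hypothetical:

```python
def fft_tradeoff(fft_size, sample_rate=44100):
    """Return (bin spacing in Hz, frame duration in milliseconds)."""
    bin_hz = sample_rate / fft_size
    frame_ms = 1000 * fft_size / sample_rate
    return bin_hz, frame_ms

for size in (512, 2048, 8192):
    bin_hz, frame_ms = fft_tradeoff(size)
    print(f"FFT {size}: {bin_hz:.2f} Hz per bin, {frame_ms:.1f} ms per frame")
```

At the default of 2048, each bin is about 21.5 Hz wide and each frame spans about 46 ms. Quadrupling the size to 8192 narrows the bins to about 5.4 Hz but stretches the frame to roughly 186 ms.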

Windowing

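The FFT assumes the signal repeats periodically. When the frame does not contain a whole number of cycles, the abrupt edges cause spectral leakage: energy smears into neighbouring bins. Windowing tapers the frame toward zero at both edges. Hann is a good general default. Blackman reduces leakage further at the cost of slightly wider peaks. Rectangular (no taper) gives maximum resolution but causes leakage artifacts.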
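
The two tapered windows can be written out from their textbook definitions. This sketch uses the symmetric form with the common Blackman coefficients (0.42, 0.5, 0.08); whether Soniform uses the symmetric or periodic variant is not stated, so treat that choice as an assumption.

```python
import math

def hann(size):
    """Symmetric Hann window: rises from 0 to 1 and back to 0."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (size - 1)) for i in range(size)]

def blackman(size):
    """Symmetric Blackman window with the classic 0.42 / 0.5 / 0.08 coefficients."""
    return [
        0.42
        - 0.5 * math.cos(2 * math.pi * i / (size - 1))
        + 0.08 * math.cos(4 * math.pi * i / (size - 1))
        for i in range(size)
    ]

def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample, before the FFT."""
    return [x * w for x, w in zip(frame, window)]

tapered = apply_window([1.0] * 5, hann(5))
```

A rectangular window is simply the case where every weight is 1, so the frame passes through unchanged and its edges stay abrupt.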

Fourier transform, frequency resolution, spectral leakage, windowing
Learn - Spectrogram

The Spectrogram

A spectrogram is a rolling history of FFT snapshots: time runs left to right, frequency bottom to top, and colour encodes intensity. Dark navy means silence. Bright cyan or white means strong energy.

What to look for

Horizontal stripes are sustained tones. Vertical smears are transients such as claps or drum hits. Diagonal streaks indicate pitch glides, while vibrato appears as a wavy line. Speech shows shifting formant bands separated by silence.

Adjust the history slider (1 to 8 seconds) in the sidebar. Short windows reveal rapid events; long windows show how the sound evolves over time.
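
A rolling history like this is commonly kept in a fixed-length buffer that discards its oldest column as each new one arrives. The sketch below assumes roughly 60 frames per second; the function name make_history and the four-value stand-in column are hypothetical.

```python
from collections import deque

def make_history(history_seconds, frames_per_second=60):
    """Create a rolling buffer sized to hold `history_seconds` of spectrum columns."""
    return deque(maxlen=int(history_seconds * frames_per_second))

history = make_history(history_seconds=2)
for frame_index in range(500):
    column = [0.0] * 4        # stand-in for one frame of FFT magnitudes
    history.append(column)    # once full, the oldest column is dropped automatically
```

With a 2 second history the buffer never grows past 120 columns, so drawing cost stays constant no matter how long the analysis runs.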

time-frequency, STFT, formants, transients
Learn - Harmonic Map

Harmonic Map

A vibrating string or tube produces not just f0 but a series of overtones at integer multiples of it (very nearly so for real instruments), the harmonic series.

fn = n times f0    n = 1, 2, 3, 4 ...

Soniform plots this as a radial node graph. The central cyan node is f0. Surrounding violet nodes are overtones. Node size represents amplitude.

Two instruments playing the same note have the same node positions but very different sizes. A flute concentrates energy at n=1 and n=2. A violin spreads to n=6 and beyond. That difference is what makes them sound distinct.
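
One plausible way to obtain the node sizes is to read the spectrum at each harmonic's nearest bin. This is a sketch under that assumption, not a description of Soniform's implementation; the function name harmonic_amplitudes and the toy spectrum are hypothetical.

```python
def harmonic_amplitudes(f0, magnitudes, bin_hz, count=8):
    """Return (n, frequency, magnitude) for harmonics n = 1..count that fit in the spectrum."""
    nodes = []
    for n in range(1, count + 1):
        freq = n * f0                          # f_n = n times f0
        bin_index = round(freq / bin_hz)       # nearest FFT bin
        if bin_index >= len(magnitudes):
            break                              # harmonic lies above the analysed range
        nodes.append((n, freq, magnitudes[bin_index]))
    return nodes

# Toy spectrum with 100 Hz bins: strong fundamental at 200 Hz, weaker 2nd harmonic.
spectrum = [0.0] * 50
spectrum[2], spectrum[4] = 1.0, 0.5
nodes = harmonic_amplitudes(200.0, spectrum, bin_hz=100.0, count=3)
```

Two instruments playing the same note would produce the same frequencies in this list but different magnitudes, which is the difference the node sizes make visible.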

overtones, timbre, harmonic series

Welcome to Soniform

A real-time audio analysis platform that makes the hidden mathematics of sound visible. Here is everything you need to get started.

Getting Started

How to Use Soniform

1. Open the Analyzer. Click Start Analyzing on the home screen, or use the top navigation bar from any page.
2. Load your sound. Drag or click to upload an audio file (mp3, wav, flac, ogg) using the upload zone in the left sidebar, or click Microphone Input to analyse live sound from your mic.
3. Watch the visualizations respond. All four panels update in real time. The metrics bar at the bottom shows computed values every frame.
4. Adjust the analysis controls in the sidebar. FFT size, smoothing, windowing function, and gain all affect how the spectrum looks and what it reveals.
5. Compare multiple files side by side on the Compare page. Add as many tracks as you like and see their spectra, waveforms, and metrics simultaneously.
6. Visit Learn to understand what each number means, with live readings applied to whatever sound you are currently analysing.

The Four Visualizations

What You Are Seeing

Waveform

The raw air pressure over time. Wide swings mean loud sound. A flat line means silence. Shape reveals rhythm, attack, and sustain.

FFT Spectrum

The frequency fingerprint at this instant. Logarithmic x-axis from 20 Hz to 22 kHz, decibels on the y-axis. Each peak is a frequency component present in the sound.

Spectrogram

A rolling history of the spectrum. Time runs left to right, frequency runs bottom to top, colour encodes loudness. Notes appear as horizontal stripes. Transients appear as vertical smears.

Harmonic Map

The overtone series as a radial node diagram. The central node is the fundamental. Surrounding nodes are integer multiples. Node size encodes amplitude, making timbre visible.

The Pages

Navigating Soniform

Analyzer

The main tool. Load audio or enable your microphone to explore all four visualizations and seven metrics in real time. Adjust FFT settings, windowing, and gain from the sidebar.

Compare

Add multiple audio files and see their spectra, waveforms, and metrics side by side. Ideal for comparing instruments, singers, or different versions of a mix.

Learn

Deep explanations of every metric and visualization. When audio is active, Learn shows your live readings so you understand what each number means for your specific sound.

Introduction

This page. A guide to everything in Soniform including how to use it, what you are seeing, what the numbers mean, and what makes this platform different.

Sidebar Controls Explained

Analysis Settings

FFT Size (512 to 8192)

Larger means more frequency detail but slower time response. 2048 is the recommended balance. Use 8192 for fine pitch analysis. Use 512 for tracking fast transients.

Smoothing (0 to 0.99)

How much the spectrum is averaged frame to frame. High smoothing gives a calmer, easier-to-read display. Low smoothing responds instantly to rapid changes.
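
Frame-to-frame smoothing of this kind is usually an exponential moving average, which is how the Web Audio API defines its analyser smoothing constant. The sketch below assumes Soniform follows the same rule; the function name smooth_spectrum is hypothetical.

```python
def smooth_spectrum(previous, current, smoothing):
    """Blend the previous smoothed spectrum with the current frame's magnitudes."""
    # smoothing = 0 shows only the current frame; values near 1 change very slowly.
    return [smoothing * p + (1 - smoothing) * c for p, c in zip(previous, current)]

displayed = [0.0, 0.0]
for _ in range(3):
    displayed = smooth_spectrum(displayed, [1.0, 2.0], smoothing=0.5)
```

With a smoothing value of 0.5 the display covers half the remaining distance to the new value on every frame, so sudden peaks fade in and out rather than flickering.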

Window Function

Hann is the best general-purpose choice. Blackman reduces spectral leakage further. Rectangular gives maximum resolution but causes leakage artifacts.

Gain (plus or minus 20 dB)

Amplifies or attenuates the signal before analysis. Use positive gain for quiet recordings. Use negative gain if the signal is clipping.
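
A gain in decibels corresponds to a linear multiplier of 10^(dB / 20) on the sample values. The helper names in this sketch are hypothetical:

```python
def db_to_gain(db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def apply_gain(samples, db):
    """Scale every sample by the linear equivalent of `db` decibels."""
    factor = db_to_gain(db)
    return [x * factor for x in samples]

boosted = apply_gain([0.01, -0.02], 20)   # +20 dB multiplies amplitude by 10
```

The full range of the control therefore spans a factor of 10 in either direction: +20 dB multiplies the signal by 10 and -20 dB divides it by 10.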

Peak Labels

Annotates the strongest frequency peaks in the spectrum with their frequency in Hz or kHz. Toggle off for a cleaner view.

Harmonic Lines

Overlays dashed vertical lines at harmonic series positions in the spectrum. Helps you see whether peaks align with expected overtones.

Pro tip: Try playing a single sustained note on any instrument into your microphone, then open Learn and select Harmonic Map. You will see exactly which overtones your instrument produces and how the Harmonic Complexity Score reflects its timbre in real time.

Under the Hood

The Mathematics

Every animation frame, roughly 60 times per second on a typical display, Soniform collects a block of audio samples from the Web Audio API, applies a windowing function to reduce edge artifacts, then runs the Fast Fourier Transform. The FFT output drives the frequency-domain panels, the waveform panel draws the raw samples, and all four visualizations update together each frame.

Pitch detection uses autocorrelation, finding the dominant period in the signal without relying on the FFT. The Harmonic Complexity Score is the Shannon entropy of the overtone amplitude distribution, normalised to 0 to 100. All other metrics including centroid, rolloff, ZCR, and RMS are standard DSP calculations performed on the raw waveform or frequency data.

Everything runs entirely in the browser. No server, no upload, no latency beyond your hardware. Your audio never leaves your device.

Academic context: The techniques here, including STFT, spectral moments, harmonic analysis, and autocorrelation pitch detection, are foundational to Music Information Retrieval (MIR), speech processing, acoustic engineering, and computational musicology. The Harmonic Complexity Score is Soniform's own formulation, applying information-theoretic entropy to the overtone domain.