
Case study 02 · May 2026 — present

Audio Atlas

The catalog I built for the way I actually use audio.

Role: Designer, builder, sole engineer
Timeline: May 2026 to present
Stack: Astro · FastAPI · Python · Pedalboard · Claude
Status: Shipped · king-mix workflow live 2026-05-27

01

The pitch

Sit down to make something. The blank Ableton session waits. You half-remember a vocal you recorded on your phone last Tuesday. You open Finder. iCloud hasn't downloaded the file yet. By the time it shows up, the idea is gone.

That moment, repeated a thousand times across eight years of writing music, is the gap this catalog closes. Twenty thousand audio files (every voice memo, every jam, every bounce, every sample I've ever touched) sit in one place now. Every row plays the instant you click its waveform. You can crop a four-bar slice out of a thirty-minute jam in ten seconds. You can peel a vocal off its instrumental with one click. You can promote a loop straight into Ableton's browser as a normalized, ready-to-drag file. You can hand a finished track to a mix engineer that knows what kinds of mixes I love, A/B the six takes it sends back, crown the winner, and publish it to a public sounds page without leaving the catalog.

None of that existed in the first version of this site. Three weeks ago this was a default-Squarespace-shaped portfolio that couldn't play a single audio file. What follows is the journey of one row in a catalog learning what it could do, told through real iterations and live components you can touch.

The row carries the work. Tabs come and go. What the row can do is the contract.
[Stats: files in the catalog · verbs per row (was 2) · named buckets (auto-classified) · mix variants per job]

02

The card grew up

Six eras. Each one is a real git commit where the row earned a new verb. Drag the slider, watch the row's capability light up underneath as it learns what it can do.

LIVE · DRAG TO SCRUB
[Interactive demo: the same card rendered in each of its six eras. States shown:]

  • VOX_Tolo_0007.wav · voice memo, Oct 23, 2024 · 0:30 · "A short voice memo, captured between sets. Possible lyric fragment." · tags: voice-memo, idea-fragment, first-person
  • NOIZU_125_shaker_loop_tight_01.wav · loop · 0:04 · 99 BPM · E major · ○ PUBLIC · tags: loop, sample · lineage: from sample: NOIZU pack v3
  • rainforest_post_karoke.als → wav · 5:42 · 100 BPM · key Em · 142 bars · crop: 12 bars · 28.8s · 1:18 → 1:48 · stems: async Demucs, ~30s on M1
  • rainforest_post_karoke (chorus 1).wav · 100 BPM · Em · 0:18 · loop · six mix variants: focal · forward_bright (ready), focal · recessed_warm (ready), space · punchy_dry (rendering 47%), space · spacious_wet (rendering 32%), dynamics · modern_compressed (queued), dynamics · vintage_open (queued) · focal pair ready to judge: A forward_bright vs B recessed_warm · audio faceoff · cross-axis memory

Card verbs
  • Queue
  • Listen →
  • Waveform
  • ▶ Audition
  • Click waveform = play
  • Bucket J/L/F/T
  • Expand panel
  • Inline rename
  • Hold ✕ to archive
  • ✂ Crop loop
  • Split stems
  • Spectrogram
  • → Promote to Live
  • → Send to Engineer

E1 · 2026-05-15 · FIRST PORTAL

Filename + metadata + a separate Listen page

New this era: the row exists. Queue for later, Listen → opens a dedicated Listen page. Nothing happens inside the card itself.

The portal had five tabs: Surprise, Library, Workbench, Projects, Listen. The card was a pointer, not a tool. Lots of pages, almost no functionality per row.

d79f9b2 · publish: portal v1 with Card.astro

One markup, six design eras, swapped via a single class. The chip strip is the actual capability arc of the production card across three weeks. Watch E5 morph into a crop surface. Watch E6 open a pipeline beneath itself.

Before this, the row was a filename in Finder. You couldn't hear it. You couldn't see what was in it. You couldn't do anything to it without leaving the catalog and opening another app. Each era above earned the row one new thing it could do for itself.

The pattern that emerged: every time the row could do more by itself, a whole UI surface could be deleted. The dedicated Listen page went away once the waveform played. The Library tab went away once Buckets could filter and sort. Workbench went away once cropping and stem-splitting moved into the card's expand panel. Tabs got added back later for the pipelines downstream of the row (Engineer, Sounds), but the row itself stayed the contract underneath.

03

Touch the current state

The production BucketCard with the full action surface. Click the waveform to hear it loop (real audio). Click ✂ Crop to slice. Click 2-stem or 4-stem to split. Click Send to Engineer to queue a mix job. Nothing actually deletes, splits, or sends, but every affordance is wired.

LIVE · TRY EVERY VERB

0:04

Tags
music-only splice loop sample drum
Tech
99 BPM · E major
Actions
The actual production markup. The waveform plays a procedurally-synthesised drum loop at the card's own BPM. ✂ Crop opens an inline crop frame on the waveform. 2/4-stem flips through a fake Demucs render and reveals the resulting child rows. Send to Engineer queues a fake job for six A/B variants. Two variants below show the same component across other catalog states.

0:30

0:08

Tags
loop loop sample
Tech
55 BPM · C major
Actions
✓ In Live

This is the row that used to be a filename. The audio plays inside it. The metadata edits inside it. The tags you used to manage in different surfaces live in the expand panel. The row stopped being a label on a file and became the place where you work with the file.

04

The catalog files itself

Your sample library is twelve thousand files across forty-seven sub-folders. You know there's a perfect snare somewhere in there. You give up and use a stock one. That's the problem this layer is for.

The catalog organizes itself in two passes. Pass one is a Browse strip at the top of every Buckets page: All · Jam · Loops · Field · Tracks · Inbox. An auto-classifier reads each row's content type, source, and duration, then writes its bucket. Jams (long-form musical captures) and Field rows (voice memos plus audio extracted from videos) land in their named buckets automatically. Loops and Tracks stay curator-picked. Inbox is the residual that hasn't been auto-classified yet or that I haven't gotten to. The first catalog run auto-bucketed 1,389 field rows and 329 jam rows without me touching anything.
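
The bucket rules above can be sketched as a small function. This is a minimal illustration, not the catalog's actual classifier: the field names, the source labels, and the five-minute jam threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: "long-form" is not defined in the write-up; five minutes is a guess.
JAM_MIN_SECONDS = 5 * 60

# Assumption: hypothetical labels for the sources that count as field recordings.
FIELD_SOURCES = {"voice-memo", "video-extract"}


@dataclass
class Row:
    content_type: str                     # hypothetical: "music" or "speech"
    source: str                           # hypothetical: "voice-memo", "daw-bounce", ...
    duration_s: float
    curator_bucket: Optional[str] = None  # "loops" or "tracks" when hand-picked


def classify(row: Row) -> str:
    """Return the bucket name for a row."""
    # Loops and Tracks stay curator-picked, so a manual choice always wins.
    if row.curator_bucket in ("loops", "tracks"):
        return row.curator_bucket
    # Voice memos and audio pulled out of videos land in Field.
    if row.source in FIELD_SOURCES:
        return "field"
    # Long-form musical captures land in Jam.
    if row.content_type == "music" and row.duration_s >= JAM_MIN_SECONDS:
        return "jam"
    # Everything else waits in the Inbox.
    return "inbox"
```

Putting the curator check first means a re-run of the classifier can never overwrite a hand-picked Loop or Track.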

Pass two is the tag chip strip below it. The chips are frequency-ranked tags in the current bucket+category view. Click one and the row count narrows. The counts beside each chip update to reflect what's actually left. The system never asks you to read a documentation page about how filtering works. The chips show what's there and what filtering by it would do.
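
One way to get chips that always show what filtering would do is to recount tags over the rows that are still visible. The sketch below assumes rows are plain dicts with a `tags` list; the function name, the row shape, and the sample tags are illustrative, not taken from the production code.

```python
from collections import Counter


def chip_strip(rows, selected=frozenset()):
    """Return (visible_rows, chips) for the tags selected so far.

    chips is a list of (tag, count) pairs, most frequent first, where count
    is how many rows would remain if that tag were added to the filter.
    """
    selected = frozenset(selected)
    visible = [r for r in rows if selected <= set(r["tags"])]
    counts = Counter(
        tag for r in visible for tag in set(r["tags"]) if tag not in selected
    )
    # Sort by count (descending), then name, so ties come out in a stable order.
    chips = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return visible, chips


# Illustrative rows; the tag assignments are made up for the example.
ROWS = [
    {"name": "driving_fast_so_i_dont_drown.wav",
     "tags": ["voice-note", "songwriting", "first-person"]},
    {"name": "VOX_Tolo_0007.wav", "tags": ["voice-note", "first-person"]},
    {"name": "if_i_loved_you_better.wav", "tags": ["voice-note", "songwriting"]},
    {"name": "NOIZU_125_shaker_loop_tight_01.wav", "tags": ["loop"]},
]
```

Calling `chip_strip(ROWS, {"songwriting"})` narrows the view to two rows and re-ranks the remaining chips against only those two.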

LIVE · CLICK A TAG
TAGS
3,458 of 3,458 items
  • NOIZU_125_shaker_loop_tight_01.wav 0:04 99 BPM · E · loop
  • driving_fast_so_i_dont_drown.wav 1:48 87 BPM · A · voice-note
  • rainforest_post_karoke.wav 5:42 100 BPM · Em · jam
  • 55_Cmaj_0_what_you_know.wav 0:08 55 BPM · C · loop
  • VOX_Tolo_0007.wav 0:30 · · voice-note
  • monday_morning_thanks.wav 2:14 78 BPM · G · voice-note
  • OLIVER_80_perc_loop.wav 0:06 80 BPM · C · loop
  • if_i_loved_you_better.wav 3:22 92 BPM · D · voice-note
The chip strip below sits under the Browse strip on the production page. Click 'songwriting', the strip re-ranks, the row count drops. Add 'first-person' to compound it. The view becomes the index. Tags come from Whisper transcripts that Claude classified, plus tags I typed into card expand panels. The catalog is a mirror of my own listening.

05

The files start speaking

You record ninety minutes of jamming. The file lands on your desktop as Audio 14.wav. A week later you can't tell which of the seventeen Audio NN.wav files on your desktop has the part you actually liked. Whisper transcribes. Claude summarizes. Now the row tells you what's in it.

LIVE · 3 ROWS, 3 SUMMARIES

rainforest_post_karoke.als

47:08 · JAM
TRANSCRIPT · WHISPER

ok, ok so the bass is doing the thing where it loops a fifth and a sub-octave… yeah that’s good. that’s good. who has the kick. is anyone on kick? someone’s on kick. iguanas are NOT a chord progression, Marcus.

SUMMARY · CLAUDE

Five musicians; none of them know what reverb is for the first 22 minutes. The bassist locks into a groove around minute 31; everyone gets it together for ~90 seconds; then someone mentions iguanas and they never recover. Two genuinely good 16-bar sections worth pulling.

  • loop-candidate · 31:14 → 31:46
  • loop-candidate · 38:02 → 38:30

driving_fast_so_i_dont_drown.wav

1:48 · VOICE-NOTE
TRANSCRIPT · WHISPER

wait wait wait wait wait. maybe if I just stop pretending. that’s it. that’s the— no the chorus is doing too much work. the chorus is— you know what no the verse. the verse is the chorus. shit. semi truck. shit.

SUMMARY · CLAUDE

First-person, anxious, two minutes of stream-of-consciousness while passing a semi truck. Roughly 12 distinct ideas, each one abandoned for the next within four seconds. The phrase “maybe if I just stop pretending” surfaces twice. Songwriting flag: high.

  • lyric-fragment · 0:38
  • lyric-fragment · 1:21
  • songwriting

VOX_Tolo_0007.wav

0:30 · VOICE-NOTE
TRANSCRIPT · WHISPER

ok so. what if the chorus, like, doesn’t come back. like. you write it, you sing it once. and then you just… leave. you leave it hanging. is that— is that a thing.

SUMMARY · CLAUDE

30 seconds, conspiratorial tone. A small theory about choruses: maybe they’re supposed to leave you hanging. Worth tagging “structure-experiment” — there are six other memos in the catalog testing variations of this idea.

  • idea-fragment
  • structure-experiment
  • first-person
Three real catalog rows, three real-vibe Claude outputs. The point isn't classification. It's pulling humanity back out of the silos. Multi-hour rambling voice memos become one-sentence character studies. A jam session becomes a logline. The files keep all their audio, but they also keep words about themselves now.

This is the part of the build I didn't know I needed until it started returning summaries. Eight years of audio I'd never gone back to became eight years of rows I'd actually read. A jam session I forgot existed surfaced because Claude noticed two 16-bar sections worth pulling. A voice memo got flagged because the same conspiratorial chorus theory had shown up in six other memos across the archive.

A hallucination sweep runs over the whole catalog too. Whisper sometimes spirals on long files with music sections (the model keeps writing "thank you thank you thank you" while ambient guitar plays). The sweep flags rows with low unique-word ratios or repeating top-4-grams and relabels them as music-only, so the catalog stops claiming a 17-second clip is a profound speech about gratitude. The first pass relabeled 91 of 100 hallucination-shape rows. The engine isn't just classifying. It's noticing its own mistakes.
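
The two signals named above, a low unique-word ratio and a dominant repeated 4-gram, are cheap to compute from the transcript text alone. A minimal sketch follows; the function name and both thresholds are assumptions, since the write-up does not give the cutoffs the sweep actually uses.

```python
from collections import Counter


def looks_hallucinated(transcript, min_unique_ratio=0.3,
                       max_top_ngram_share=0.3, n=4):
    """Flag transcripts that look like a Whisper repetition spiral.

    Both thresholds are illustrative guesses, not the catalog's real values.
    """
    words = transcript.lower().split()
    if len(words) < n:
        return False  # too short to judge either signal

    # Signal 1: few distinct words relative to the total word count.
    unique_ratio = len(set(words)) / len(words)

    # Signal 2: the most common n-gram accounts for a large share of all n-grams.
    ngrams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    top_share = ngrams.most_common(1)[0][1] / sum(ngrams.values())

    return unique_ratio < min_unique_ratio or top_share > max_top_ngram_share
```

A transcript of "thank you" repeated twenty times trips both signals, while the VOX_Tolo_0007 memo quoted earlier passes cleanly.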

06

Rip a loop out of a thirty-minute jam

Pull a 12-bar chunk out of a 5:42 jam in ten seconds. Drag the cyan frame to slide. Drag a handle to retrim. Spacebar plays only what's between the caps.

Before this existed, the same task meant opening the jam in Ableton, eyeballing the warp markers, dragging a loop region, bouncing the selection, renaming the bounce, dragging the new file into Live's browser. Five minutes of context-switching per loop, every loop, forever. Most loops never got pulled out at all because the cost of pulling one out was so high.

LIVE · DRAG · ⌘ SNAP · SPACE TO LOOP
rainforest_post_karoke.wav 5:42 · 100 BPM · 142 bars · key Em
13 bars · 31.2s
13 bars · 31.2s · 0:25 → 1:48

space play / pause  ·  scrub ±1s in the loop  ·  toggle bar snap  ·  drag the frame body to slide  ·  drag a handle to retrim

Bar-snap on means the caps grab the nearest bar boundary (so the loop will actually loop). Bar-snap off lets you trim by milliseconds. Export drops a normalized .wav into _Soundbending/loops/extracted/ as a child row of the source jam (lineage chip and all). When the crop covers half the source or more, the engine offers to promote the crop and trash the source so the catalog keeps the canonical take.
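
The snap arithmetic is small enough to show in full. The sketch below assumes 4/4 time, which the write-up does not state but which matches the demo's numbers (12 bars at 100 BPM is 28.8 seconds). The function names are hypothetical.

```python
def bar_seconds(bpm: float, beats_per_bar: int = 4) -> float:
    """Length of one bar in seconds (assumes a constant tempo)."""
    return beats_per_bar * 60.0 / bpm


def snap_to_bar(t: float, bpm: float, beats_per_bar: int = 4) -> float:
    """Move a time in seconds to the nearest bar boundary."""
    bar = bar_seconds(bpm, beats_per_bar)
    return round(t / bar) * bar


def crop_bounds(start: float, end: float, bpm: float, snap: bool = True):
    """Return the crop caps, snapped to bars when bar-snap is on."""
    if snap:
        return snap_to_bar(start, bpm), snap_to_bar(end, bpm)
    return start, end


def offers_promotion(crop_s: float, source_s: float) -> bool:
    """True when the crop covers half the source or more."""
    return crop_s >= 0.5 * source_s
```

With snap on, caps dragged to 77.0s and 107.0s at 100 BPM land on 76.8s and 108.0s, which is exactly 13 bars.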

Now the crop lives inside the catalog row, where the audio already is. The crop tool only matters because the catalog already knew the tempo. Librosa runs on every audio row at ingest. When BPM and key are populated, every loop tool downstream gets to assume them. The cropper draws bar ticks. The promoter writes BPM and key into the filename. The mix engineer's renderer crossfades at section boundaries because the analyzer already found them. One analysis pass at the catalog level lights up every tool downstream.

07

The card writes into Ableton

You finish a rough mix. You bounce it. You name it v1. Two days later you bounce a different version, also v1. Which one had the kick where you wanted it? Before the Promote button, every loop and bounce I made was the start of an organizational mistake.

The Promote button settles all of that. One click writes the loop into Live's own browser, normalized to -14 LUFS, named with its BPM and key, ready to drag onto a clip slot. The instinct when extending a DAW is to build a new surface inside it. The right move was the opposite. Ableton already drags audio onto clip slots from any folder you register as a Place, so the engine just writes into a Place I registered once. The DAW's own gesture does the work. No plugin, no Max device, no custom audio API.
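
For reference, the arithmetic behind hitting a loudness target is a single subtraction. This sketch assumes the integrated loudness has already been measured elsewhere and models normalization as one static gain. That is the simplest case, not necessarily what the catalog's loudnorm pass does, and the function names are hypothetical.

```python
TARGET_LUFS = -14.0  # the target named in the write-up


def gain_db_to_target(measured_lufs: float,
                      target_lufs: float = TARGET_LUFS) -> float:
    """Gain in dB that moves a file from its measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a dB gain to the multiplier applied to each sample."""
    return 10.0 ** (gain_db / 20.0)
```

A loop measured at -20 LUFS needs +6 dB, which is roughly a 2x multiplier on the samples. A static gain that large can clip, which is why real loudness passes usually pair it with peak limiting.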

LIVE · CLICK PROMOTE

55_Cmaj_0_what_you_know.wav

LOOPS
TECH 55 BPM · C major · 0:08
LIVE

Three steps. Sub-second per loop.

The promote pipeline runs synchronously on the M1 the moment you click. Each step writes real state. Re-promoting is idempotent — the file overwrites in place, the LUFS pass re-runs, the column gets the same path.
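
Idempotency here mostly falls out of a deterministic destination path: the same source, BPM, and key always produce the same filename, so a second promote overwrites the first. The sketch below illustrates that property only. The naming scheme and function names are assumptions, and the loudness pass is left out.

```python
import shutil
from pathlib import Path


def promoted_name(stem: str, bpm: float, key: str) -> str:
    """Build a filename carrying BPM and key (hypothetical naming scheme)."""
    return f"{round(bpm)}_{key}_{stem}.wav"


def promote(src: Path, place_dir: Path, bpm: float, key: str) -> Path:
    """Copy a loop into the folder registered as an Ableton Place.

    Re-running with the same inputs overwrites the same file, so the
    path stored on the catalog row never changes.
    """
    place_dir.mkdir(parents=True, exist_ok=True)
    dest = place_dir / promoted_name(src.stem, bpm, key)
    shutil.copyfile(src, dest)  # replaces dest if it already exists
    return dest
```

Because the destination is derived rather than generated, there is no "v1 (2)" file to clean up after a second click.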

First three loops landed in _Soundbending/loops/ on 2026-05-17 evening. The Ableton Browser Place was set up once. After that, every promoted loop appears in Live's own sidebar.

The button is real. The pipeline is real. The animation timing approximates the actual sub-second-per-loop behavior of the M1 (the loudnorm pass is the slowest of the three steps).

Earlier in the build I tried the opposite approach, a Max-for-Live device that rendered the catalog inside Ableton itself. Two architectural walls killed it. The pivot to writing into the affordance instead of building inside it became the spine of the whole build, not just this feature.

08

An engineer that learns my taste

You finish a track. You want to A/B against a reference, a Tame Impala cut maybe. You jump out of Ableton, open Spotify, miss the timing, forget what you were comparing, give up. A/B is the most honest mixing tool there is, and it's the one DAWs make hardest.

The mix engineer is the second pipeline downstream of the row. Where Promote-to-Live ships a clean loop into the DAW, the Engineer ships six full-mix takes back into the catalog for me to judge. The three axes:

  • focal · forward_bright vs recessed_warm
  • space · punchy_dry vs spacious_wet
  • dynamics · modern_compressed vs vintage_open

Six variants render per source. The Mix Faceoff shows them one A/B pair at a time, one pair per axis. Tap A or B to switch playback. Hold to commit a vote. Free-text comment per pair persists into the system. The next render on the same source reads the full comment history, so a space-axis retry that mentions a buried kick will also address a vocal complaint from an earlier focal pair on the same track. The system learns my taste one pair at a time.
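
The three axes and their poles map directly onto a small data structure. The sketch below uses the axis and pole names from the list above; the function names and the vote record shape are assumptions for illustration.

```python
# Axis and pole names as listed above.
AXES = {
    "focal": ("forward_bright", "recessed_warm"),
    "space": ("punchy_dry", "spacious_wet"),
    "dynamics": ("modern_compressed", "vintage_open"),
}


def variants():
    """All (axis, pole) combinations rendered for one source."""
    return [(axis, pole) for axis, poles in AXES.items() for pole in poles]


def faceoff_pairs():
    """One A/B pair per axis, in the order the axes are declared."""
    return [((axis, a), (axis, b)) for axis, (a, b) in AXES.items()]


def comment_history(votes):
    """Every comment left on a source, regardless of which axis it was on.

    votes is a hypothetical list of dicts like
    {"axis": "focal", "winner": "recessed_warm", "comment": "vocal buried"}.
    """
    return [v["comment"] for v in votes if v.get("comment")]
```

Keeping the comment history unfiltered by axis is what lets a later space-axis render see a complaint that was originally typed against a focal pair.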

LIVE · FACEOFF A vs B
rainforest_post_karoke (chorus 1) FOCAL AXIS
0:00 / 0:18
A forward_bright

REASONING

  • vocal +1.2 dB · pulled up over the bus
  • -3 dB low shelf @ 220 Hz · cleared bass mud under the vocal
  • de-ess @ 6 kHz · 4 dB threshold
  • drums tight · room verb -2 dB
  • plate decay 1.2s · short, present

TRANSLATION

phone: ok · car: harsh top · club: clean
B recessed_warm

REASONING

  • vocal -0.4 dB · sits inside the mix
  • +1.5 dB low shelf @ 180 Hz · weight under the vocal
  • low-pass @ 14 kHz · softer top
  • room verb on snare · ambience added
  • plate decay 1.8s · longer tail

TRANSLATION

phone: vocal lost · car: ok · club: collapses

The production pair card. Click A/B (or ←/→), pick a side, leave a comment. Spacebar plays a faux loop with a moving playhead. Stem solo buttons swap which stem plays on both sides simultaneously, so the comparison stays apples-to-apples. After each variant render, five simulated playback chains report what the mix will do on phone, AirPods, laptop, car, and club systems. Mix decisions get made against actual listening contexts, not against the abstract idea of a balanced mix.

Each pair you commit becomes the per-axis winner for that source. Three axes voted, three winners. The Winners surface upstairs consolidates them into one card per track. Variants stop being identified by their axis-and-pole strings ("focal · recessed_warm") and pick up stable color names instead: Citrine, Maroon, Mist, Slate, Pewter, Sienna. The catalog stores them as <source> · <Color> so you can scan a Winners card and remember which mix is which.
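
A sketch of the winner tally and the color labels follows. Only one pairing is visible in this write-up (Maroon is the recessed_warm mix in the crowned demo). The rest of the mapping below simply assigns the listed colors in order and is an assumption, as are the function names and the vote record shape.

```python
# Assumed mapping: only ("focal", "recessed_warm") -> "Maroon" appears in the demo.
COLOR_NAMES = {
    ("focal", "forward_bright"): "Citrine",
    ("focal", "recessed_warm"): "Maroon",
    ("space", "punchy_dry"): "Mist",
    ("space", "spacious_wet"): "Slate",
    ("dynamics", "modern_compressed"): "Pewter",
    ("dynamics", "vintage_open"): "Sienna",
}


def winners(votes):
    """Map each axis to its winning pole; a later vote on an axis replaces an earlier one."""
    result = {}
    for vote in votes:  # hypothetical shape: {"axis": ..., "winner": ...}
        result[vote["axis"]] = vote["winner"]
    return result


def label(source: str, axis: str, pole: str) -> str:
    """Stable display name in the '<source> · <Color>' form."""
    return f"{source} · {COLOR_NAMES[(axis, pole)]}"
```

The color name depends only on the axis and pole, so the same variant keeps the same label across renders and across surfaces.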

Three winners means it's time to crown a king. Tap ♛ on the mix that beats the other two across axes. The card collapses to just the king plus the original (so you can A/B back to the unprocessed take), with a comment box for a note that becomes part of the planner's memory the next time it renders. From the crowned state there are three paths forward. Publish to Sounds flips the mix public on the catalog's own /sounds page and tags it with a collection (EP idea, album draft). Make more mixes kicks a fresh batch through the planner with the full vote history plus your crown comment in context. Uncrown drops back to candidate cross-compare. The king card is the king of the loop.

LIVE · CROWNED KING MIX
rainforest_post_karoke.wav ♛ Maroon
  • Original unprocessed source
  • ♛ Maroon your king mix · recessed_warm

kick fits inside the bass now. low end finally feels like one instrument. needs a hair more air on the cymbals but i can live with that.

The Winners card after three pairs voted and one variant crowned. Click ▶ to flash a play state on Original or the king. Click → Publish to Sounds to flip the public state on. + Make more mixes feeds the planner the full vote + crown comment history for a fresh batch. Uncrown drops back to candidate compare. Dismiss hides the source from Winners until a new send-to-Engineer.

The reference catalog shipped on 2026-05-22. Bon Iver, Tame Impala, Frank Ocean, 070 Shake, others I love, each row carrying per-track notes on what works in it. The planner pulls the closest references for the source into its prompt as taste anchors before any variant gets rendered. Closest-reference is weighted similarity across spectral centroid, LUFS, band cosine, stereo width, and BPM when the ref has been audio-analyzed, and tag-affinity when it hasn't. The mixes come back already shaped by what I've told the engine matters.
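
A weighted similarity with a tag fallback could look like the sketch below. The five features are the ones named above, but the weights, the scale constants, the closeness formula, the Jaccard-style tag affinity, and the dict field names are all assumptions. The write-up does not say how the real planner combines them.

```python
import math

# Assumed weights; they sum to 1 so a perfect match scores 1.0.
WEIGHTS = {"centroid": 0.25, "lufs": 0.15, "bands": 0.35, "width": 0.10, "bpm": 0.15}
# Assumed scales: the difference at which a scalar feature's closeness falls to 0.5.
SCALES = {"centroid": 1000.0, "lufs": 3.0, "width": 0.2, "bpm": 20.0}


def cosine(a, b):
    """Cosine similarity between two band-energy vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closeness(x, y, scale):
    """1.0 for identical values, decaying toward 0 as they diverge."""
    return 1.0 / (1.0 + abs(x - y) / scale)


def audio_similarity(src, ref):
    score = WEIGHTS["bands"] * cosine(src["bands"], ref["bands"])
    for name, scale in SCALES.items():
        score += WEIGHTS[name] * closeness(src[name], ref[name], scale)
    return score


def tag_affinity(src_tags, ref_tags):
    """Jaccard overlap between tag sets (assumed form of tag-affinity)."""
    a, b = set(src_tags), set(ref_tags)
    return len(a & b) / len(a | b) if a | b else 0.0


def similarity(src, ref):
    # Use audio features when the reference has been analyzed, tags otherwise.
    if ref.get("analyzed"):
        return audio_similarity(src, ref)
    return tag_affinity(src["tags"], ref["tags"])


def closest_refs(src, refs, k=3):
    """The k references most similar to the source, best first."""
    return sorted(refs, key=lambda r: similarity(src, r), reverse=True)[:k]
```

Both branches return a score between 0 and 1, which lets analyzed and tag-only references share one ranking. Whether those two scores are truly comparable is a design question this sketch does not settle.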

09

What didn't change underneath

Three weeks of tab consolidations, splits, renames, additions. Through every one of them the contract beneath the UI stayed the same.

The catalog is a set of rows and a set of verbs you can do to them: play, crop, split, promote, send to engineer, publish. Every interaction in every surface is one of those verbs. Surprise and Library got deleted on 2026-05-17 because Buckets did what they did with one fewer page. Engineer and Sounds got added on 2026-05-22 and 2026-05-27 because the row could now feed pipelines, and pipelines needed somewhere to live. None of those moves changed the row or the verbs. The row stayed the contract. The tabs were just where you went to use it.

That's what made the construction tractable. New surface? Take the verbs you need, glue together a new lobby page on top. Surface no longer earning its place? Hide it from nav, keep the route alive for bookmarks, move on. The row is the only thing that has to stay coherent.

Verbs are the contract. Surfaces are disposable. The catalog will still be itself the day I delete the last tab.

10

What it deliberately does not do

A music tool I trust is a music tool that doesn't pretend to author. This list is short on purpose and gets checked before any feature lands.

  • Write Substack, LinkedIn, or Instagram posts without an explicit draft invocation.
  • Propose albums, EPs, or song concepts.
  • Push daily prompts.
  • Listen to voice commands during a session.
  • Touch the desktop.
  • Publish to streaming platforms.
  • Make creative decisions on my behalf.

Every one of those is a feature an adjacent product ships. Every one is a place where the machine would pretend to author. A catalog that emails me an unsolicited "draft album concept" stops being a catalog and becomes a parasocial collaborator. That trade was off the table from the start.

Whisper transcribes. Claude classifies. The Engineer renders variants and reasons about them. The Sounds page hosts what I've published. None of them synthesize. The synthesis happens later, in Live, by hand.

11

Where this points

The point of the build is not the build. The point is more music made.

Adjacent to all of this is a Diary tab that pairs random voice-memo fragments with random catalog audio, so a melody you hummed into your phone in 2023 can surface next to an instrumental sketch you made last week. Coming back to a project six months later, trying to remember what the idea was and where the pieces went, is the friction the catalog was originally built to remove. The Diary is the surface where that friction has nothing left to hide behind.

The public-facing uxjon.com/sounds page is already live. The publishing path is concrete: Send to Engineer from a track in Buckets, vote three pairs in the Mix Faceoff, crown your king on the Winners card, click Publish to Sounds, label the collection. The same database, the same row, the same verbs. The path from "this loop is interesting" to "this track is out" doesn't cross a single tool boundary.

3,750 rows still sit in the inbox bucket. 215 promotable loops are queued. The machine just made choosing cheaper.

Utility, not synthesis. The computer sorts, fingerprints, surfaces, and removes format friction. The artist still chooses, still drags, still arranges.