lifelessai/aud0

AI-powered audio forensics suite - drop in a track, get psychological, narrative, technical, cultural, and genre analysis via Bytez + Web Audio API

JavaScript 80.7%
CSS 18.8%
HTML 0.5%

ibotzhub 147601d3a4 fix(audio): port Web Audio + canvas pipeline from working monolith	2026-05-13 02:12:03 -07:00
- Replace stub audio-context.js with analyser graph, isolation chain, and waterfall/neural/particle/stereo animation loops (from harajuku Deck build)
- Map viz button modes (spectrum/waterfall/bars, stereo lissajous/phase/spread, particle cloud/orbit/explosion) to internal render paths
- State: blob URL revoke, score scratch fields for real-time meters
- main: start viz loops after boot, resize after layout, stop API key box clicks from opening the file picker, allow GO without AI (viz-only path)
- README: document port source

app	fix(audio): port Web Audio + canvas pipeline from working monolith	2026-05-13 02:12:03 -07:00
legacy	docs(legacy): correct git revs for recovering monolith HTML	2026-05-13 02:03:46 -07:00
.gitignore	chore: archive legacy prototype, add .gitignore, fix README layout	2026-05-13 01:59:48 -07:00
README.md	fix(audio): port Web Audio + canvas pipeline from working monolith	2026-05-13 02:12:03 -07:00

README.md

AUDIO FORENSICS PRO

Advanced AI-powered audio analysis and deconstruction suite. Drop in an audio file, connect a Bytez API key, and get deep psychological, narrative, technical, cultural, gaming, and genre breakdowns of your track in real time.

Repository: ibotzhub/aud0.

Repo structure

The shipping app lives in the app/ directory (static ES modules: app/index.html plus css/ and js/).

The old single-file HTML prototype is no longer in this repo. See legacy/README.md for how to recover a copy from git history, or for where similar files often live on disk.

aud0/
  README.md
  .gitignore
  legacy/
    README.md
  app/
    index.html
    css/
      style.css
    js/
      main.js
      ui-builder.js
      ui.js
      audio-context.js
      ai-service.js
      state.js
      storage.js
      utils.js

Getting started

No build step. Serve the app/ folder over HTTP (ES modules will not load from file://).

Clone the repo and cd app/.
Start a static server, for example with either of these commands:

npx serve .

python3 -m http.server 8080

Open the URL the server prints (for the python3 command above, http://localhost:8080).
Enter a Bytez API key, pick an audio file, and use the app.

The app uses ES modules, so you cannot rely on double-clicking index.html from disk; use HTTP.

AI setup

This app uses the Bytez API for audio understanding.

Classification model: aaraki/wav2vec2-base-finetuned-ks
Chat model: Qwen/Qwen2-Audio-7B-Instruct

Enter your Bytez API key on the upload screen or in the chat sidebar. The key is stored in localStorage after the first successful save.

Audio window setting

The "Audio window" dropdown controls how much audio is sent per request:

Setting	Behavior
10-60 seconds	Captures a live clip of that length from the playing audio
Full file	Sends the entire file as base64 if it is 6 MB or smaller; larger files are uploaded to a temporary host

If your file is over 6 MB and the Audio window is set to Full file, the app may upload the file to a temporary public host and pass the URL to the API.
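
The decision described above can be sketched as a small pure function. This is a minimal illustration, not the code in ai-service.js: the function name, the mode strings, and the return values are hypothetical, and it assumes "6 MB" means 6 * 1024 * 1024 bytes, which the README does not specify.

```javascript
// Hypothetical helper, not taken from ai-service.js.
// Assumption: "6 MB" is treated as 6 * 1024 * 1024 bytes.
const INLINE_LIMIT_BYTES = 6 * 1024 * 1024;

/**
 * Decide how audio should be delivered for one request.
 * @param {"clip"|"full"} mode "clip" for the 10-60 second settings, "full" for Full file
 * @param {number} fileSizeBytes size of the loaded file in bytes
 * @returns {"clip"|"inline-base64"|"temp-upload"}
 */
function chooseDelivery(mode, fileSizeBytes) {
  // Clip settings always capture from the playing audio, whatever the file size.
  if (mode === "clip") return "clip";
  // Full file: inline up to the limit, otherwise fall back to a temporary upload.
  return fileSizeBytes <= INLINE_LIMIT_BYTES ? "inline-base64" : "temp-upload";
}
```

Keeping this choice in one function would make it easy to warn the user before a file leaves the browser for a public host.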

Module overview

File	Purpose
`main.js`	Entry point, event wiring, orchestrates everything
`ui-builder.js`	Builds the DOM, exports the `ui` element reference object
`ui.js`	Toasts, metadata rendering, AI card rendering
`audio-context.js`	Web Audio API setup, visualisation animation loops
`ai-service.js`	Bytez API calls, audio capture/encoding, key management
`state.js`	Shared mutable state passed between modules
`storage.js`	Thin `localStorage` wrapper with error handling
`utils.js`	JSON extraction, file download, audio helpers

Paths above are under app/js/ except app/index.html and app/css/style.css.
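
As an illustration of what a "thin localStorage wrapper with error handling" might look like, here is a minimal sketch. The names createStorage, get, and set are hypothetical and are not the actual exports of storage.js; the backend is passed in so the same code can be exercised outside a browser.

```javascript
// Hypothetical sketch; names are illustrative, not the real storage.js API.
// `backend` is any object with getItem/setItem, such as window.localStorage.
function createStorage(backend) {
  return {
    // Returns the parsed value, or `fallback` if the key is missing,
    // the stored text is not valid JSON, or the backend throws.
    get(key, fallback = null) {
      try {
        const raw = backend.getItem(key);
        return raw === null ? fallback : JSON.parse(raw);
      } catch {
        return fallback;
      }
    },
    // Returns true on success, false if serialisation or the write fails
    // (for example when the quota is exceeded or storage is disabled).
    set(key, value) {
      try {
        backend.setItem(key, JSON.stringify(value));
        return true;
      } catch {
        return false;
      }
    },
  };
}
```

In a browser this would be created once, for example with `createStorage(window.localStorage)`, and callers would check the boolean from `set` before reporting that a key was saved.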

Analysis sections

Each section is collapsible and AI-populated when you load a file:

Psychological - emotional tone, cognitive load, listener affect
Narrative Structure - story arc, tension, resolution patterns
Technical - frequency composition, dynamics, production quality
Cultural Context - genre genealogy, regional influences, zeitgeist
Gaming Applications - sync opportunities, gameplay mood matching
Genre and Mood - genre tags, similar artists, playlist placement

Each section has a Refresh control to re-query the AI independently.
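
Chat models often wrap structured output in extra prose, which is presumably why the module overview lists JSON extraction under utils.js. The sketch below shows one simple way to do that, assuming each section's reply contains a single JSON object. The function name and the field names in the usage example are hypothetical, and the real utils.js may work differently.

```javascript
// Hypothetical helper; the real utils.js may extract JSON differently.
// Parses the span from the first "{" to the last "}" in `text`.
// Returns the parsed value, or null if there is no such span or it is not valid JSON.
function extractJson(text) {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1));
  } catch {
    return null;
  }
}

// Usage with a made-up reply; "mood" and "bpm" are illustrative field names.
const reply = 'Here is the analysis:\n{"mood": "tense", "bpm": 128}\nHope that helps.';
const parsed = extractJson(reply);
```

Returning null instead of throwing lets a section show an error state and leave its Refresh control usable.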

Power-ups

The power-up buttons sit below the visualisations. They are currently stubs, ready for implementation:

Power-up	Description
MIDI Export	Convert detected melody to MIDI
Isolate Elements	Stem separation
Beat Detective	BPM, grid, and swing detection
Key and Chords	Key, chord, and harmony identification
Genre Analysis	Deep genre and cultural mapping

Visualisations

Four real-time canvas visualisers, each with multiple display modes:

Frequency Waterfall - Spectrum, Waterfall, Bars
Neural Network - Network, Nodes, Flow
Particle Field - Cloud, Orbit, Explosion
Stereo Field - Lissajous, Phase, Spread
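
One way to map mode buttons to render paths is a lookup table with a safe fallback. The sketch below uses the mode names from the list above, but the visualiser keys and the resolveMode function are hypothetical and may not match what audio-context.js does internally.

```javascript
// Mode names come from the list above; the keys and the function
// are hypothetical and may not match audio-context.js.
const VIZ_MODES = new Map([
  ["waterfall", ["spectrum", "waterfall", "bars"]],
  ["neural", ["network", "nodes", "flow"]],
  ["particle", ["cloud", "orbit", "explosion"]],
  ["stereo", ["lissajous", "phase", "spread"]],
]);

// Returns the requested mode (lower-cased) if the visualiser supports it,
// otherwise that visualiser's first mode. Throws on an unknown visualiser.
function resolveMode(visualiser, requested) {
  const modes = VIZ_MODES.get(visualiser);
  if (!modes) throw new Error(`Unknown visualiser: ${visualiser}`);
  const wanted = String(requested).toLowerCase();
  return modes.includes(wanted) ? wanted : modes[0];
}
```

Falling back to the first mode keeps an animation loop drawing even if a button label and the table drift apart.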

Browser requirements

Modern browser with ES module support (Chrome, Firefox, Edge, Safari 14+)
Web Audio API
MediaRecorder for live clip capture (Chrome/Firefox; Safari partial)
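
These requirements could be checked at startup with plain feature detection. The sketch below is not part of the repo; the checkSupport name and the shape of its result are hypothetical. It takes the global object as a parameter so it can be tested outside a browser.

```javascript
// Hypothetical check, not part of the repo. `env` defaults to globalThis,
// which is `window` in a browser.
function checkSupport(env = globalThis) {
  return {
    // Older Safari versions expose the constructor as webkitAudioContext.
    webAudio: typeof (env.AudioContext || env.webkitAudioContext) === "function",
    // Needed only for the 10-60 second live clip settings.
    mediaRecorder: typeof env.MediaRecorder === "function",
  };
}
```

An app using this might hide the clip settings when `mediaRecorder` is false while still offering Full file.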

Notes

No framework or bundler; Google Fonts load from a CDN.
Shared state lives in state.js. Avoid circular imports around state.
The ui object from ui-builder.js is the single source of truth for DOM references. Do not use document.getElementById outside ui-builder.js.
The app/js/audio-context.js module contains the Web Audio routing and canvas visualisers, ported from the working single-file build (same logic as Documents/harajuku.tech/audio_forensics_fixed1.html, May 2026). Power-up buttons in main.js are still placeholders until those flows are wired to Bytez or local DSP helpers.