Analyze music with a state-of-the-art Audio Language Model that listens directly to the audio waveform. Unlike text-based AI, this model processes the actual sound, identifying harmony, structure, timbre, lyrics, and cultural context. Supports audio up to 20 minutes long (MP3, WAV, or FLAC).

Two modes are available:
(1) Presets: pass a preset name such as catalog_metadata, mood_tags, or full_report for structured, optimized output.
(2) Custom prompt: pass a prompt for free-form questions.

The full_report preset runs all 13 presets in parallel and returns a comprehensive music intelligence report. Use GET /api/songs/analyze/presets to list available presets.
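As a rough illustration of the preset listing call, the sketch below builds (but does not send) a GET request for /api/songs/analyze/presets. Only the path comes from this page. The base URL `https://api.example.com` and the `x-api-key` header name are placeholders assumed for the example, so substitute the values from your Recoup account documentation.

```python
import urllib.request

# Hypothetical base URL; this page only gives the path, not the host.
BASE_URL = "https://api.example.com"


def presets_request(
    api_key: str,
    base_url: str = BASE_URL,
    api_key_header: str = "x-api-key",  # assumed header name, not confirmed here
) -> urllib.request.Request:
    """Build a GET request for the preset listing endpoint."""
    url = base_url.rstrip("/") + "/api/songs/analyze/presets"
    return urllib.request.Request(
        url,
        headers={api_key_header: api_key},
        method="GET",
    )


req = presets_request("YOUR_API_KEY")
```

Passing `req` to `urllib.request.urlopen` would send it. The shape of the returned preset list is not described on this page, so inspect the decoded body before relying on it.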
Your Recoup API key.
Music analysis request
Provide exactly one of preset or prompt. Use preset for structured analysis workflows, or prompt for free-form questions.
Name of a curated analysis preset. Use instead of prompt for structured, optimized output. The 'full_report' preset runs all 13 presets in parallel and returns a comprehensive report. See List Analyze Presets for the full list of available presets.
catalog_metadata, mood_tags, lyric_transcription, mix_feedback, song_description, music_theory, similar_artists, sample_detection, sync_brief_match, audience_profile, content_advisory, playlist_pitch, artist_development_notes, full_report
"catalog_metadata"
Text prompt or question about the music
1 - 24000
"Describe the genre, tempo, and mood of this track."
Public URL to an audio file (MP3, WAV, or FLAC — up to 20 minutes)
"https://example.com/song.mp3"
Maximum number of tokens to generate
1 <= x <= 2048
512
Controls output creativity — higher values produce more varied responses
0 <= x <= 2
0.7
Nucleus sampling probability cutoff
0 <= x <= 1
0.9
Enable sampling (set true when using temperature or top_p)
false
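The request rules above (exactly one of preset or prompt, plus the documented numeric ranges) can be checked on the client before sending. The following is a minimal sketch under stated assumptions, not the official client. The field names `preset`, `prompt`, `temperature`, and `top_p` appear on this page, while `audio_url`, `max_new_tokens`, and `do_sample` are assumed names for the audio URL, token limit, and sampling flag. The prompt limit of 1 to 24000 is treated as a character count, which is also an assumption.

```python
from typing import Any, Dict, Optional

# Preset names copied from this page; the presets endpoint is the source of truth.
PRESETS = {
    "catalog_metadata", "mood_tags", "lyric_transcription", "mix_feedback",
    "song_description", "music_theory", "similar_artists", "sample_detection",
    "sync_brief_match", "audience_profile", "content_advisory",
    "playlist_pitch", "artist_development_notes", "full_report",
}


def _put_in_range(body: Dict[str, Any], name: str, value, low, high) -> None:
    """Copy an optional numeric option into the body after a range check."""
    if value is None:
        return
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}")
    body[name] = value


def build_analyze_body(
    audio_url: str,
    preset: Optional[str] = None,
    prompt: Optional[str] = None,
    max_new_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
) -> Dict[str, Any]:
    """Build a request body; `audio_url`, `max_new_tokens`, `do_sample` are assumed names."""
    # Exactly one of preset or prompt must be supplied.
    if (preset is None) == (prompt is None):
        raise ValueError("provide exactly one of preset or prompt")
    body: Dict[str, Any] = {"audio_url": audio_url}
    if preset is not None:
        if preset not in PRESETS:
            raise ValueError(f"unknown preset: {preset}")
        body["preset"] = preset
    else:
        if not 1 <= len(prompt) <= 24000:
            raise ValueError("prompt length must be between 1 and 24000")
        body["prompt"] = prompt
    _put_in_range(body, "max_new_tokens", max_new_tokens, 1, 2048)
    _put_in_range(body, "temperature", temperature, 0, 2)
    _put_in_range(body, "top_p", top_p, 0, 1)
    # Sampling should be enabled whenever temperature or top_p is used.
    if temperature is not None or top_p is not None:
        body["do_sample"] = True
    return body


body = build_analyze_body("https://example.com/song.mp3", preset="catalog_metadata")
```

Turning on the sampling flag automatically keeps callers from setting temperature or top_p and then having them ignored, which is the situation the sampling note above warns about.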
Music analysis completed successfully
Request status
success
Preset used for analysis, when applicable
"catalog_metadata"
Model output for single-preset or custom-prompt analysis. May be plain text or structured JSON depending on the preset.
Full report payload returned only when using the full_report preset
Inference time in seconds
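Because the model output may be plain text or structured JSON, and full_report results arrive in a separate payload, client code usually needs a small branch when reading the response. The sketch below shows one way to do that. The response field names `status`, `preset`, `response`, and `report` are hypothetical, since this page describes the fields without naming them. Adjust them to match the actual response schema.

```python
import json
from typing import Any, Dict


def extract_analysis(payload: Dict[str, Any]) -> Any:
    """Return the analysis content from a decoded response body.

    Field names used here are assumptions, not confirmed by this page.
    """
    if payload.get("status") != "success":
        raise RuntimeError(f"analysis did not succeed: {payload!r}")
    # full_report results are returned in their own payload.
    if payload.get("preset") == "full_report":
        return payload["report"]
    output = payload["response"]
    # Output may be plain text or JSON; parse only text that looks like JSON.
    if isinstance(output, str) and output.strip()[:1] in ("{", "["):
        try:
            return json.loads(output)
        except json.JSONDecodeError:
            return output
    return output


result = extract_analysis(
    {"status": "success", "preset": "catalog_metadata", "response": '{"genre": "pop"}'}
)
```

Text that does not start with `{` or `[` is returned unchanged, so free-form answers to custom prompts are never misread as JSON.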