Use case
Lyric Alignment API
For developers building karaoke, lyric-video, or caption features who need lyrics timed to the audio, not just the lyric text.
The problem
A track and its lyric text are not enough to highlight words in time or render a synced lyric video. Aligning each word to the audio yourself means running a forced-alignment model, tuning it per track, and hosting the inference.
Why MusicAPI fits this
One call, then free: POST a clip_id and get back a timestamped alignment array. The first alignment of a clip costs 1 credit; repeat requests for the same clip are served from cache at no charge.
Ready-to-render timeline: the response maps lyric segments to start and stop times, so you can drive karaoke-style word highlighting, lyric videos, or caption tracks directly.
No model to host: forced alignment runs on our side and you receive plain JSON, with no GPU, model weights, or alignment pipeline to operate.
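As a rough illustration of the "ready-to-render timeline" point, the sketch below turns alignment segments into a WebVTT caption track. The segment field names (`text`, `start`, `end`) and the use of seconds as the time unit are assumptions made for illustration. This page only states that segments carry start and stop times, so check a real response before relying on these names.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def alignment_to_webvtt(alignment: list) -> str:
    """Render alignment segments as WebVTT cues.

    Assumes each segment looks like {"text": str, "start": float, "end": float}
    with times in seconds. These keys are hypothetical, not confirmed by the API.
    """
    lines = ["WEBVTT", ""]
    for segment in alignment:
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"{start} --> {end}")
        lines.append(segment["text"])
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)
```

Writing the returned string to a `.vtt` file gives a caption track that an HTML5 `<track>` element can load. If the real segments use different keys or milliseconds, only the three dictionary lookups and the unit conversion need to change.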
Code sample
A real request against the live API: POST the clip_id of a finished clip and read the alignment array from the response.
# Get the word/line alignment timeline for a finished clip
curl -X POST https://api.musicapi.ai/api/v1/sonic/aligned-lyrics \
  -H "Authorization: Bearer $MUSICAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"clip_id": "YOUR_CLIP_ID"}'
# -> { "code": 200, "message": "success",
# "data": { "alignment": [ { /* timed lyric segments */ } ] } }
Pricing
MusicAPI is pay-as-you-go with credit packs, plus predictable monthly subscriptions. The per-credit rate is the same across packs and subscriptions. See the pricing page for current rates, free credits, and volume options.
Related: Suno API · Producer AI API
FAQ
What does the alignment contain?
A timeline array that maps segments of the lyrics to their start and stop times in the track. It is structured for word- or line-level synchronization, so you can highlight lyrics as the song plays or build a lyric video.
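To show how such a timeline could drive highlighting, here is a minimal sketch that finds the segment to highlight at a given playback position. It assumes segments are sorted by start time and shaped like `{"text", "start", "end"}` with times in seconds. Those field names are hypothetical and are not confirmed on this page.

```python
from bisect import bisect_right


def active_segment(alignment, position):
    """Return the segment playing at `position` seconds, or None in a gap.

    Assumes segments are sorted by start time and shaped like
    {"text": str, "start": float, "end": float}; these keys are hypothetical.
    """
    starts = [segment["start"] for segment in alignment]
    # Index of the last segment that starts at or before the position.
    index = bisect_right(starts, position) - 1
    if index < 0:
        return None  # playback has not reached the first segment yet
    segment = alignment[index]
    return segment if position < segment["end"] else None
```

A player would call this on each time update and highlight the returned segment. For long tracks, building the `starts` list once and reusing it avoids repeating that work on every call.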
How much does it cost?
One credit the first time you align a given clip. The result is cached, so every later request for the same clip_id is returned for free.
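For budgeting, the rule above reduces to counting distinct clips. The helper below is a hypothetical estimate, not part of the API, and it assumes that none of the clips were aligned before and that every request succeeds.

```python
def estimate_credits(clip_ids):
    """Estimate credits for a batch of alignment requests.

    Only the first alignment of each distinct clip_id costs 1 credit;
    repeat requests for the same clip are served from cache for free.
    """
    return len(set(clip_ids))
```

For example, four requests covering two distinct clips would be estimated at 2 credits.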
Do I need to supply the lyrics?
No. The alignment is derived from a generated clip that already has lyrics, so you only pass its clip_id. A clip with no lyrics returns a not-found response.
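A minimal Python sketch of that call, using only the standard library, might look like the following. The endpoint, headers, and request body mirror the curl sample. How the not-found case is signalled is an assumption: the sketch treats it as HTTP 404, but the API may instead report it through the `code` field of the JSON body.

```python
import json
import urllib.error
import urllib.request

API_URL = "https://api.musicapi.ai/api/v1/sonic/aligned-lyrics"


def build_request(clip_id, api_key):
    """Build the POST request for one clip (same shape as the curl sample)."""
    body = json.dumps({"clip_id": clip_id}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_alignment(payload):
    """Pull data.alignment out of a decoded response body, or None if absent."""
    data = payload.get("data") or {}
    return data.get("alignment")


def fetch_alignment(clip_id, api_key):
    """Return the alignment list, or None if the clip has no lyrics.

    Assumption: "not found" arrives as HTTP 404. If the API reports it in the
    JSON `code` field instead, check that field here.
    """
    try:
        request = build_request(clip_id, api_key)
        with urllib.request.urlopen(request, timeout=30) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise
    return extract_alignment(payload)
```

Calling `fetch_alignment("YOUR_CLIP_ID", os.environ["MUSICAPI_KEY"])` (after `import os`) would return the alignment list for a clip with lyrics and `None` for one without, under the 404 assumption stated above.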
Build it in 5 minutes
Get free credits on signup and run real generations before any payment. No credit card required to start.
API details verified 2026-06-07. The API surface evolves; the pricing page always has current rates.