Lyric alignment — word-timed lyrics

Turn a finished track into a synced lyric timeline: word- and line-level timestamps you can use for karaoke highlighting, lyric videos, or caption tracks. The exact request/response schema is in the interactive API reference; this guide covers when to use the endpoint, what it returns, and how it's billed.

Endpoint

| Endpoint | Output | Credits |
|---|---|---|
| POST /api/v1/sonic/aligned-lyrics | Alignment timeline (JSON) | 1 (first time per clip; cached free after) |

Pass the clip_id of a generated track that already has lyrics. The first alignment of a clip costs 1 credit; the result is cached, so every later request for the same clip_id returns from cache at no charge.

curl -X POST https://api.musicapi.ai/api/v1/sonic/aligned-lyrics \
  -H "Authorization: Bearer $MUSICAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "clip_id": "YOUR_CLIP_ID" }'

A successful response looks like this:

{
  "code": 200,
  "message": "success",
  "data": { "alignment": [ /* timed lyric segments */ ] }
}
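
For clients that don't shell out to curl, the same call can be sketched in Python with only the standard library. The helper names (build_request, parse_alignment, fetch_alignment) are hypothetical, and the error check assumes failures are reported through a non-200 "code" field in the body; the API reference is the authority on how errors actually surface.

```python
import json
import urllib.request

API_URL = "https://api.musicapi.ai/api/v1/sonic/aligned-lyrics"


def build_request(clip_id: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example."""
    payload = json.dumps({"clip_id": clip_id}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_alignment(body: bytes) -> list:
    """Return data.alignment from a response body, or raise on a non-success code."""
    doc = json.loads(body)
    if doc.get("code") != 200:
        # Assumption: failures carry a non-200 "code" in the JSON body.
        raise RuntimeError(f"alignment failed: {doc.get('code')} {doc.get('message')}")
    return doc["data"]["alignment"]


def fetch_alignment(clip_id: str, api_key: str, timeout: float = 30.0) -> list:
    """Send the request and return the alignment segments."""
    # urlopen raises urllib.error.HTTPError on 4xx/5xx HTTP statuses, so a
    # not-found reply may arrive as an exception rather than as a JSON body.
    with urllib.request.urlopen(build_request(clip_id, api_key), timeout=timeout) as resp:
        return parse_alignment(resp.read())
```

Splitting request construction from response parsing keeps both halves checkable without touching the network; only fetch_alignment performs the actual call.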

What you get

data.alignment is an array of timed segments, each mapping a span of the lyrics to its start and end time in the track. Use it to:

Karaoke / word highlighting — advance the highlight as playback time crosses each segment's start.
Lyric videos — render timed text over the audio without hand-syncing.
Captions / subtitles — generate a caption track aligned to the vocal.
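
As a rough sketch of the first and third uses, the helpers below find the segment to highlight at a given playback time and render the timeline as SRT captions. The segment shape used here ("text", "start", "end", with times in seconds) is an assumption made for illustration, not the documented schema; map the real field names from the API reference before using it.

```python
from bisect import bisect_right

# Hypothetical segment shape: {"text": str, "start": seconds, "end": seconds}.
# The real field names and time units are defined in the API reference.


def active_index(starts: list, t: float) -> int:
    """Index of the last segment whose start is <= t, or -1 before the first one."""
    return bisect_right(starts, t) - 1


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments: list) -> str:
    """Render one numbered SRT cue per segment, separated by blank lines."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        span = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        cues.append(f"{i}\n{span}\n{seg['text']}\n")
    return "\n".join(cues)


# Made-up sample data in the assumed shape.
segments = [
    {"text": "Hello", "start": 0.5, "end": 0.9},
    {"text": "world", "start": 1.0, "end": 1.6},
]
# The binary search assumes segments are sorted by start time.
starts = [seg["start"] for seg in segments]
```

Call active_index(starts, t) on each playback tick and highlight the returned segment. The lookup keeps pointing at the most recent segment during gaps, so compare t against that segment's end if the highlight should clear between phrases.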

Notes

The clip must already have lyrics. A clip with no lyrics returns a not-found response.
Alignment is computed server-side (forced alignment) — there's no model or GPU for you to host.
Because results are cached per clip_id, it's safe and free to re-fetch the alignment whenever your client needs it.