Lyric alignment — word-timed lyrics

Turn a finished track into a synced lyric timeline: word- and line-level timestamps you can use for karaoke highlighting, lyric videos, or caption tracks. The exact request/response schema is in the interactive API reference; this guide covers when to use the endpoint, what it returns, and how it's billed.

Endpoint

| Endpoint | Output | Credits |
|---|---|---|
| POST /api/v1/sonic/aligned-lyrics | Alignment timeline (JSON) | 1 (first time per clip; cached free after) |

Pass the clip_id of a generated track that already has lyrics. The first alignment of a clip costs 1 credit; the result is cached, so every later request for the same clip_id returns from cache at no charge.

curl -X POST https://api.musicapi.ai/api/v1/sonic/aligned-lyrics \
  -H "Authorization: Bearer $MUSICAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "clip_id": "YOUR_CLIP_ID" }'

Example response:

{
  "code": 200,
  "message": "success",
  "data": { "alignment": [ /* timed lyric segments */ ] }
}
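
The same call can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names are hypothetical, and it assumes a failure is reported through a "code" other than 200 in the response body, which this guide does not confirm. Check the interactive API reference for the actual error format.

```python
import json
import urllib.request

API_URL = "https://api.musicapi.ai/api/v1/sonic/aligned-lyrics"


def build_alignment_request(clip_id: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example."""
    body = json.dumps({"clip_id": clip_id}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_alignment(payload: dict) -> list:
    """Return data.alignment, or raise if the envelope does not report success."""
    # Assumption: a non-200 "code" in the JSON body signals failure.
    if payload.get("code") != 200:
        raise RuntimeError(
            f"alignment failed: {payload.get('code')} {payload.get('message')}"
        )
    return payload["data"]["alignment"]


def fetch_alignment(clip_id: str, api_key: str, timeout: float = 30.0) -> list:
    """Send the request and return the alignment segments."""
    request = build_alignment_request(clip_id, api_key)
    # urlopen raises urllib.error.HTTPError on a non-2xx HTTP status.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_alignment(json.load(response))
```

Splitting request construction from response parsing keeps both halves testable without network access; only fetch_alignment touches the API.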

What you get

data.alignment is an array mapping segments of the lyrics to their start and stop times in the track. Use it to:

  • Karaoke / word highlighting — advance the highlight as playback time crosses each segment's start.
  • Lyric videos — render timed text over the audio without hand-syncing.
  • Captions / subtitles — generate a caption track aligned to the vocal.
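
The highlighting and caption uses above can be sketched as follows. The segment shape here is an assumption made for illustration: each segment is taken to be a dict with hypothetical keys "text", "start", and "end", with times in seconds and segments sorted by start. The real field names and units are in the interactive API reference, so adapt the key lookups before using this.

```python
from bisect import bisect_right
from typing import Optional

# Hypothetical segment shape: {"text": str, "start": float, "end": float},
# times in seconds, sorted by "start". Not confirmed by this guide.


def active_segment_index(segments: list, playback_s: float) -> Optional[int]:
    """Index of the segment covering playback_s, or None between segments."""
    starts = [seg["start"] for seg in segments]
    # Last segment whose start is at or before the playback position.
    i = bisect_right(starts, playback_s) - 1
    if i >= 0 and playback_s < segments[i]["end"]:
        return i
    return None


def vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments: list) -> str:
    """Render one WebVTT cue per segment."""
    cues = [
        f"{vtt_timestamp(seg['start'])} --> {vtt_timestamp(seg['end'])}\n{seg['text']}"
        for seg in segments
    ]
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"
```

For karaoke, call active_segment_index on each playback tick and highlight the returned segment. For captions, word-level segments would usually be grouped into lines before calling to_webvtt, since one cue per word is hard to read.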

Notes

  • The clip must already have lyrics. A clip with no lyrics returns a not-found response.
  • Alignment is computed server-side (forced alignment) — there's no model or GPU for you to host.
  • Because results are cached per clip_id, it's safe and free to re-fetch the alignment whenever your client needs it.
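
As a rough illustration of the billing rule, the credits for a batch of requests equal the number of distinct clips that have not been aligned before. The helper below is hypothetical and assumes every alignment succeeds; this guide does not say whether a failed request (for example, a clip with no lyrics) is charged.

```python
def credits_needed(requested_clip_ids, already_aligned=frozenset()):
    """Estimate credits: 1 per distinct clip_id not aligned before."""
    # Repeat requests for the same clip_id are served from cache at no charge.
    return len(set(requested_clip_ids) - set(already_aligned))
```

For example, requesting clips "a", "b", and "a" again costs 2 credits, and costs 1 if "a" was already aligned earlier.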