Guide · Published 2026-05-22

How to Replace a Section of an AI Song

Swap a chorus. Fix a verse. Change the vocals in a single time window. The replace_music endpoint and its starts_at, ends_at, and instruction parameters, explained with working code.


What "replace" means in an AI music context

Sometimes an AI-generated song is 90% right and 10% wrong. The chorus lacks the energy you wanted, or the bridge feels weak, or the vocals in the second verse drift off-style. Without the right tool, your only options are to regenerate the whole thing (losing the parts that worked) or live with the imperfection.

replace_music is the right tool. It targets a specific time window in your existing clip and re-renders only that segment with new content matching your instruction. The audio outside the window is preserved; only the [starts_at, ends_at] region changes.

Developers typically reach for replace for one of three reasons:

  1. Section-level fixes: "the chorus needs more energy" or "the bridge feels weak" without rebuilding the whole track.
  2. Vocal swaps: change vocal style, swap lyrics, or replace a voice on a specific verse without re-rendering the instrumental backing.
  3. Instrumental swaps: keep the vocals, change the arrangement underneath them for a specific section.

The replace_music endpoint at a glance

On MusicAPI's Producer API (Google Lyria 3 Pro), the endpoint is POST /api/v1/producer/create with task_type: "replace_music". The four parameters that matter:

  • clip_id (required): the source clip containing the segment you want to replace.
  • starts_at (required): start of the replacement window, in seconds from the beginning of the source.
  • ends_at (required): end of the replacement window, in seconds from the beginning of the source. Must be greater than starts_at.
  • instruction (recommended): what to put in that window. Be specific.
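
For readers calling the API from code rather than the shell, here is a minimal Python sketch of the same request. It assumes the endpoint URL and Bearer authentication used in this guide's curl examples. The helper names build_replace_payload and submit_replace are illustrative and not part of any official SDK, and because the response format is not covered here, the sketch simply returns whatever JSON the server sends back.

```python
import json
import os
import urllib.request

# Endpoint as used in this guide's curl examples.
API_URL = "https://api.musicapi.ai/api/v1/producer/create"


def build_replace_payload(clip_id, starts_at, ends_at, instruction=None, lyrics=None):
    """Assemble the JSON body for a replace_music task (hypothetical helper)."""
    if not clip_id:
        raise ValueError("clip_id is required")
    if ends_at <= starts_at:
        raise ValueError("ends_at must be greater than starts_at")
    payload = {
        "task_type": "replace_music",
        "clip_id": clip_id,
        "starts_at": starts_at,
        "ends_at": ends_at,
    }
    # instruction and lyrics are optional, so only send them when provided.
    if instruction:
        payload["instruction"] = instruction
    if lyrics:
        payload["lyrics"] = lyrics
    return payload


def submit_replace(payload, api_key=None):
    """POST the payload and return the decoded JSON response (shape not assumed)."""
    api_key = api_key or os.environ["MUSICAPI_KEY"]
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping payload construction separate from the network call means the window check runs before any credits are spent, and the payload can be logged or cached without sending it.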

Use case: fix a weak chorus

Your hook is great. The verse works. The chorus underwhelms. Target just the chorus window and re-render:

curl -X POST https://api.musicapi.ai/api/v1/producer/create \
  -H "Authorization: Bearer $MUSICAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_type": "replace_music",
    "clip_id": "<source-clip-id>",
    "starts_at": 30,
    "ends_at": 50,
    "instruction": "Replace this chorus with a more energetic, layered version. Add backing vocals, doubled bass, a wider arrangement. Keep the melodic hook intact but make it bigger."
  }'

The result is a new clip with the chorus replaced. The intro, verse, and outro outside [30, 50] are preserved.

Use case: swap vocals on a verse (replaces swap_music_vocals)

The legacy swap_music_vocals operation was retired in the 2026-04 model platform migration. The migration path is replace_music targeting the vocal window with a vocal-focused instruction:

curl -X POST https://api.musicapi.ai/api/v1/producer/create \
  -H "Authorization: Bearer $MUSICAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_type": "replace_music",
    "clip_id": "<source-clip-id>",
    "starts_at": 60,
    "ends_at": 90,
    "instruction": "Replace the vocal performance in this window with a smooth male R&B vocal with subtle autotune. Keep the instrumental groove and chord progression exactly the same. Lyrics: When I see you smile, time stands still, every moment with you is a perfect thrill.",
    "lyrics": "[Verse 2]\nWhen I see you smile\nTime stands still\nEvery moment with you\nIs a perfect thrill"
  }'

The model uses the instruction plus the optional lyrics field to drive the vocal re-render. The instrumental underneath stays as close to the original as the model can keep it.

Use case: swap instrumental backing (replaces swap_music_sound)

Same pattern, inverted: keep the vocals (the model has them as reference) and change the instrumental backing.

curl -X POST https://api.musicapi.ai/api/v1/producer/create \
  -H "Authorization: Bearer $MUSICAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_type": "replace_music",
    "clip_id": "<source-clip-id>",
    "starts_at": 0,
    "ends_at": 60,
    "instruction": "Re-arrange this section with ambient atmospheric pads, soft piano melodies, gentle reverb, and minimalist production. Keep the vocals identical in character and lyric content."
  }'

Picking the right time window

starts_at and ends_at define the segment to replace. Rules of thumb by window size:

Window size | What it captures | When to use
5-10 seconds | A specific phrase, fill, or moment. | Fixing a single bad moment. Be precise about what to change in the instruction.
15-30 seconds | A song section (verse, chorus, bridge). | The sweet spot for most production-level fixes.
30-60 seconds | A larger arrangement chunk or a full verse-plus-chorus. | When you want to re-think a meaningful portion of the song.
Whole song length | Equivalent to cover_music with strength 0.6+. | Don't. Use cover_music if you want the whole song reinterpreted.

Instruction patterns that work

  • State what to keep, not just what to change. "Replace the chorus arrangement but keep the vocal melody and chord progression" gives the model an anchor.
  • Be specific about new content. "Add backing vocals, doubled bass, and a wider arrangement" outperforms "make it bigger."
  • For vocal swaps, name the vocal style. "Smooth male R&B with subtle autotune" tells the model exactly what register to aim for.
  • For instrumental swaps, name the instruments explicitly. "Ambient pads, soft piano, gentle reverb" beats "more chill."
  • Pass new lyrics when replacing a vocal segment. The lyrics field can carry the new lyric content alongside the instruction.
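
The "state what to keep" pattern can be turned into a small template helper. This is only a sketch: build_instruction is a hypothetical function, and the wording it produces is one plausible phrasing, not a format the API requires.

```python
def _join_with_and(items):
    """Join phrases as 'a', 'a and b', or 'a, b, and c'."""
    items = list(items)
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def build_instruction(change, keep=()):
    """Combine a specific change request with an explicit list of things to keep."""
    change = change.strip().rstrip(".")
    if not change:
        raise ValueError("describe what to change")
    sentences = [change + "."]
    if keep:
        # Naming what must survive gives the model an anchor.
        sentences.append("Keep " + _join_with_and(keep) + ".")
    return " ".join(sentences)


instruction = build_instruction(
    "Replace this chorus with added backing vocals, doubled bass, and a wider arrangement",
    keep=["the vocal melody", "the chord progression"],
)
```

Forcing every call site to pass a keep list makes it harder to ship a vague "make it bigger" instruction by accident.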

Production best practices

  1. Validate that the time window is inside the source clip. ends_at must be less than the source clip's duration. Submitting an ends_at past the end of the clip produces undefined behavior.
  2. Don't chain replaces on the same window repeatedly. Each replace re-renders the window, so drift accumulates across iterations. If your first replace doesn't land, change the instruction substantially before trying again rather than making small tweaks.
  3. Use seed for A/B tests. Same input + same seed produces the same output. Useful when you're tuning instruction wording and don't want generation randomness to confound the comparison.
  4. Combine with extend_music for full restructuring. Replace fixes a section; extend continues forward. Together they can rebuild meaningful portions of a song without rebuilding from scratch.
  5. Cache by (clip_id + starts_at + ends_at + instruction + seed). Don't pay twice for the same render.
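
Practices 1 and 5 are easy to enforce client-side. The sketch below assumes you already know the source clip's duration in seconds (how you obtain it is outside this guide), and both function names are hypothetical. The cache key is a SHA-256 hash over the five fields listed in practice 5.

```python
import hashlib
import json


def validate_window(starts_at, ends_at, clip_duration):
    """Raise ValueError unless 0 <= starts_at < ends_at < clip_duration."""
    if starts_at < 0:
        raise ValueError("starts_at must not be negative")
    if ends_at <= starts_at:
        raise ValueError("ends_at must be greater than starts_at")
    if ends_at >= clip_duration:
        raise ValueError("ends_at must be less than the clip duration")


def replace_cache_key(clip_id, starts_at, ends_at, instruction, seed=None):
    """Stable key over the inputs that determine a render."""
    fields = {
        "clip_id": clip_id,
        # Normalize to float so 30 and 30.0 map to the same key.
        "starts_at": float(starts_at),
        "ends_at": float(ends_at),
        "instruction": instruction,
        "seed": seed,
    }
    # sort_keys gives a canonical serialization regardless of dict order.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A request without a seed is not expected to be reproducible, so treat a cached result stored under seed=None as "one acceptable render" rather than "the render".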

Pricing

12 credits per replace task on MusicAPI's Producer API. Same flat cost as create, extend, and cover. Effective per-replace cost:

  • $0.18 on the $5 entry pack
  • $0.13 on Starter $19/mo
  • $0.09 on Growth $99/mo
  • $0.06 on Pro $999/mo

See Lyria 3 Pro pricing for the full plan economics.
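
To budget a batch of replaces, multiply the per-replace prices listed above by the number of tasks. The sketch below uses only the figures from this section; the plan keys are hypothetical labels, and the per-credit price is derived arithmetic (price divided by 12 credits), not a published rate.

```python
CREDITS_PER_REPLACE = 12

# Effective per-replace prices in USD, copied from the list above.
# The dictionary keys are illustrative labels, not official plan IDs.
PER_REPLACE_USD = {
    "entry_pack_5": 0.18,
    "starter_19": 0.13,
    "growth_99": 0.09,
    "pro_999": 0.06,
}


def estimate_cost(num_replaces, plan):
    """Estimated USD cost of num_replaces replace tasks on the given plan."""
    return round(num_replaces * PER_REPLACE_USD[plan], 2)


def implied_credit_price(plan):
    """USD per credit implied by the per-replace price."""
    return PER_REPLACE_USD[plan] / CREDITS_PER_REPLACE


def credits_needed(num_replaces):
    """Credits consumed by num_replaces replace tasks."""
    return num_replaces * CREDITS_PER_REPLACE
```

For example, 100 replaces consume 1,200 credits, which works out to roughly $9 at the Growth rate and $18 at the entry-pack rate.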

Common questions

What does the replace_music endpoint do?

replace_music swaps a specific time window of an existing audio clip with new content matching your instruction. You pass clip_id (the source), starts_at (window start in seconds), ends_at (window end), and instruction (what to put in that window). The model preserves the audio outside the [starts_at, ends_at] window and re-renders only the targeted segment.

What's the difference between extend_music and replace_music?

Extend continues a song forward from a timestamp, producing new audio after the source. Replace targets a specific time window inside the source and swaps that segment with new content. Use extend to make a song longer; use replace to fix or change a part of a song you already have.

Can I use replace_music to swap vocals?

Yes. Target the time window where the vocals are present (starts_at and ends_at marking the vocal section), and instruct the model to re-render with different vocal style or new lyrics. This is the recommended migration path from the legacy swap_music_vocals operation that was retired in the 2026-04 model platform migration.

Can I use replace_music to swap instrumental backing?

Yes. Same pattern: target the window and instruct the model to re-render with a different instrumental arrangement while keeping the vocals, which the model has as reference. This replaces the legacy swap_music_sound operation.

How much does replace_music cost?

12 credits per replace task on MusicAPI's Producer API. Same flat cost as create_music, extend_music, and cover_music. Failed upstream replaces are auto-refunded. Effective cost ranges from $0.06 to $0.18 per replace depending on plan.

What's the smallest time window I can target?

Practically, around 5 seconds. Below that, the model has too little context to produce coherent output. The endpoint accepts any starts_at/ends_at pair where ends_at > starts_at, but you'll get muddled output below 5 seconds. The sweet spot is 15-30 seconds for chorus/verse-level edits.

Will the audio outside my window stay exactly the same?

The model preserves the audio content outside the window, but the boundary handling means you may hear a very brief crossfade at starts_at and ends_at. For most production use this is inaudible. If you need bit-perfect preservation of the unmodified regions, do client-side splicing: load the original audio and the replace output into your audio toolchain and splice precisely at your chosen timestamps.
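
A minimal sketch of that client-side splice, under several assumptions: both clips are already decoded to PCM sample frames, they share the same sample rate and length, and they are time-aligned. splice_window is a hypothetical helper, and decoding and encoding the audio files is left to your toolchain.

```python
def splice_window(original, replaced, sample_rate, starts_at, ends_at):
    """Return the original frames with [starts_at, ends_at) taken from replaced.

    original and replaced are equal-length sequences of sample frames.
    """
    if len(original) != len(replaced):
        raise ValueError("clips must have the same number of frames")
    # Convert the timestamps in seconds to frame indices.
    start = int(round(starts_at * sample_rate))
    end = int(round(ends_at * sample_rate))
    if not 0 <= start < end <= len(original):
        raise ValueError("window falls outside the clip")
    # Frames outside the window come from the original, untouched.
    return list(original[:start]) + list(replaced[start:end]) + list(original[end:])
```

A hard cut like this keeps the untouched regions identical to the source, but it can click if the two waveforms differ sharply at the boundary, so pick splice points at quiet moments or apply your own short crossfade.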

Last updated 2026-05-22.