Use case

Text to Music API

For developers who want one endpoint that takes a written prompt and returns a finished, mixed song without running models themselves.

The problem

Most teams do not have the GPU budget, the audio-ML expertise, or the time to host and tune a music model. They have a product idea that needs a song from a sentence of text, and they need it to work the same way every time. Self-hosting turns a feature into an infrastructure project.

Why MusicAPI fits this

One POST request with a text prompt starts a job that produces a full song with vocals, lyrics, structure, and a final mix.

Pay-as-you-go credits and predictable subscriptions, so a text-to-music feature does not require a GPU cluster.

Full commercial rights on every generation, so output can ship inside a paid product.

Code sample

A real request against the live API: start a job, then poll the task endpoint until the audio is ready.

curl
# Text prompt to a finished song with vocals
# 1. Start a generation job
curl -X POST https://api.musicapi.ai/api/v1/sonic/create \
  -H "Authorization: Bearer $MUSICAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "custom_mode": true,
    "mv": "sonic-v5",
    "prompt": "[Verse] Morning light across the kitchen floor [Chorus] We are wide awake and ready for more",
    "tags": "indie pop, warm, acoustic guitar",
    "title": "Wide Awake",
    "make_instrumental": false
  }'

# Response: { "message": "success", "task_id": "a1b2c3d4-..." }

# 2. Poll until the audio is ready (state: "succeeded")
curl https://api.musicapi.ai/api/v1/sonic/task/a1b2c3d4-... \
  -H "Authorization: Bearer $MUSICAPI_KEY"

# Response data[].audio_url is a ready-to-stream MP3 once state is "succeeded".
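
The same first step can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the URL, headers, and the task_id field are taken from the curl sample above, while the function names (build_create_request, parse_task_id, start_job) are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Endpoint copied from the curl sample above.
CREATE_URL = "https://api.musicapi.ai/api/v1/sonic/create"


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a generation job."""
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_task_id(body: bytes) -> str:
    """Pull task_id out of the create response; raise if it is missing."""
    doc = json.loads(body)
    task_id = doc.get("task_id")
    if not task_id:
        raise ValueError(f"no task_id in response: {doc!r}")
    return task_id


def start_job(api_key: str, payload: dict) -> str:
    """Send the request and return the task_id (performs a network call)."""
    request = build_create_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return parse_task_id(resp.read())
```

Keeping request construction and response parsing separate from the network call makes both easy to unit test without spending credits.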

Pricing

MusicAPI offers pay-as-you-go credit packs and predictable monthly subscriptions. The per-credit rate is the same across packs and subscriptions. See the pricing page for current rates, free credits, and volume options.

FAQ

Do I send just a description or actual lyrics?

Both work. Set custom_mode to true and pass your own lyrics in prompt for precise control, or pass a short description in gpt_description_prompt and let the model write the lyrics and structure for you.
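
The two modes differ only in the request body. The sketch below builds both payloads; the field names come from the curl sample and the answer above, and the helper names are hypothetical. One labeled assumption: description mode is shown with custom_mode set to false, which this page does not state explicitly, so check the API reference before relying on it.

```python
def lyrics_payload(lyrics: str, tags: str, title: str, mv: str = "sonic-v5") -> dict:
    """Custom mode: you supply the lyrics (with section markers) in `prompt`."""
    return {
        "custom_mode": True,
        "mv": mv,
        "prompt": lyrics,
        "tags": tags,
        "title": title,
        "make_instrumental": False,
    }


def description_payload(description: str, mv: str = "sonic-v5") -> dict:
    """Description mode: the model writes the lyrics and structure."""
    return {
        # ASSUMPTION: custom_mode is false in description mode (not confirmed here).
        "custom_mode": False,
        "mv": mv,
        "gpt_description_prompt": description,
    }


custom = lyrics_payload(
    "[Verse] Morning light across the kitchen floor "
    "[Chorus] We are wide awake and ready for more",
    tags="indie pop, warm, acoustic guitar",
    title="Wide Awake",
)
described = description_payload("A warm indie pop song about an early morning")
```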

How long does a text-to-music generation take?

Finished audio is typically ready within a couple of minutes. You start the job with one POST, then poll the task endpoint until state is "succeeded". You can also register a webhook so you are notified instead of polling.
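
A polling loop with a deadline might look like the sketch below. It makes several assumptions that this page does not confirm: that state is reported on each item of data[], that a 5-second interval and 5-minute timeout are reasonable, and that any state other than "succeeded" means "keep waiting". The names wait_for_task and make_http_fetch are hypothetical.

```python
import json
import time
import urllib.request

# Task endpoint copied from the curl sample; the task id is appended to it.
TASK_URL = "https://api.musicapi.ai/api/v1/sonic/task"


def make_http_fetch(api_key: str):
    """Return a fetch(task_id) function that calls the live task endpoint."""
    def fetch(task_id: str) -> dict:
        req = urllib.request.Request(
            f"{TASK_URL}/{task_id}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)
    return fetch


def wait_for_task(fetch, task_id, interval=5.0, timeout=300.0,
                  sleep=time.sleep, clock=time.monotonic) -> dict:
    """Poll fetch(task_id) until every item in data[] has state "succeeded".

    ASSUMPTION: state lives on each data[] item. Raises TimeoutError if the
    deadline passes before the job finishes.
    """
    deadline = clock() + timeout
    while True:
        doc = fetch(task_id)
        items = doc.get("data") or []
        if items and all(item.get("state") == "succeeded" for item in items):
            return doc
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout}s")
        sleep(interval)
```

Typical use would be wait_for_task(make_http_fetch(api_key), task_id). Passing fetch, sleep, and clock as parameters keeps the loop testable without network access or real waiting.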

What audio formats come back?

Each finished generation returns a streamable MP3 audio_url. WAV and stem exports are available through the dedicated WAV and stems endpoints when you need higher-fidelity or multi-track output.
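
Once a task has succeeded, collecting the MP3 links is a short filter over the response. This sketch assumes the shape described in the code sample (a data list whose items carry audio_url) and, as a further assumption, that each item also carries its own state; audio_urls is a hypothetical helper name.

```python
from typing import List


def audio_urls(task_response: dict) -> List[str]:
    """Return the audio_url of every succeeded item in a task response."""
    return [
        item["audio_url"]
        for item in task_response.get("data") or []
        # ASSUMPTION: each item reports its own state alongside audio_url.
        if item.get("state") == "succeeded" and item.get("audio_url")
    ]
```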

Build it in 5 minutes

Get free credits on signup and run real generations before any payment. No credit card required to start.

API details verified 2026-05-16. The API surface evolves; the pricing page always has current rates.