Use case
Text to Music API
For developers who want one endpoint that takes a written prompt and returns a finished, mixed song, without running models themselves.
The problem
Most teams do not have the GPU budget, the audio-ML expertise, or the time to host and tune a music model. They have a product idea that needs a song from a sentence of text, and they need it to work the same way every time. Self-hosting turns a feature into an infrastructure project.
Why MusicAPI fits this
One POST request takes a text prompt and returns a full song with vocals, lyrics, structure, and a final mix.
Pay-as-you-go credits and predictable subscriptions, so a text-to-music feature does not require a GPU cluster.
Full commercial rights on every generation, so output can ship inside a paid product.
Code sample
A real request against the live API: start a job, then poll the task endpoint until the audio is ready.
# Text prompt to a finished song with vocals
# 1. Start a generation job
curl -X POST https://api.musicapi.ai/api/v1/sonic/create \
  -H "Authorization: Bearer $MUSICAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "custom_mode": true,
    "mv": "sonic-v5",
    "prompt": "[Verse] Morning light across the kitchen floor [Chorus] We are wide awake and ready for more",
    "tags": "indie pop, warm, acoustic guitar",
    "title": "Wide Awake",
    "make_instrumental": false
  }'
# Response: { "message": "success", "task_id": "a1b2c3d4-..." }
# 2. Poll until the audio is ready (state: "succeeded")
curl https://api.musicapi.ai/api/v1/sonic/task/a1b2c3d4-... \
  -H "Authorization: Bearer $MUSICAPI_KEY"
# Response data[].audio_url is a ready-to-stream MP3 once state is "succeeded".
Pricing
MusicAPI sells pay-as-you-go credit packs and predictable monthly subscriptions. The per-credit rate is the same across packs and subscriptions. See the pricing page for current rates, free credits, and volume options.
FAQ
Do I send just a description or actual lyrics?
Both work. Set custom_mode to true and pass your own lyrics in prompt for precise control, or pass a short description in gpt_description_prompt and let the model write the lyrics and structure for you.
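The code sample earlier on this page uses custom mode with explicit lyrics. As a rough sketch of the description-only path, the request might be built like this. The gpt_description_prompt field comes from this page; sending "custom_mode": false alongside it and reusing the "mv" value from the sample are assumptions to check against the API reference. The helper names json_escape, description_payload, and create_from_description are hypothetical and not part of the API.

```shell
# Escape backslashes and double quotes so the text can sit inside a JSON string.
# (Sketch only: newlines and other control characters are not handled.)
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a description-mode request body.
# Assumption: "custom_mode": false selects description mode; "mv" is copied
# from the custom-mode sample on this page.
description_payload() {
  printf '{"custom_mode": false, "mv": "sonic-v5", "gpt_description_prompt": "%s"}' \
    "$(json_escape "$1")"
}

# Start a generation job from a short description (needs MUSICAPI_KEY set).
create_from_description() {
  curl -fsS -X POST https://api.musicapi.ai/api/v1/sonic/create \
    -H "Authorization: Bearer $MUSICAPI_KEY" \
    -H "Content-Type: application/json" \
    -d "$(description_payload "$1")"
}
```

Calling create_from_description "a warm indie pop song about slow mornings" would send the request. As with the lyrics-based sample, the response should carry a task_id to poll.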
How long does a text-to-music generation take?
A job typically returns finished audio within a couple of minutes. You start the job with one POST, then poll the task endpoint until state is succeeded. You can also register a webhook so you are notified instead of polling.
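The start-then-poll flow described above can be sketched as a small shell loop. This is an illustration under stated assumptions: fetch_task, extract_state, and poll_task are hypothetical helper names, the 5-second delay and 40-attempt cap are arbitrary defaults, and the sed expression is a crude stand-in for a real JSON parser such as jq. This page names only the succeeded state, so the loop treats every other value as still in progress and gives up once the cap is reached.

```shell
# Fetch the current task record. The URL shape is taken from the sample request.
fetch_task() {
  curl -fsS "https://api.musicapi.ai/api/v1/sonic/task/$1" \
    -H "Authorization: Bearer $MUSICAPI_KEY"
}

# Crude extraction of a "state" value from the JSON body.
# Assumption: one state value is enough to decide; if several appear,
# this sed expression keeps the last one.
extract_state() {
  printf '%s' "$1" | tr -d '\n' |
    sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll until state is "succeeded".
# Prints the final response body and returns 0 on success,
# returns 1 when max_attempts is exhausted, 2 if a request itself fails.
poll_task() {
  local task_id="$1" max_attempts="${2:-40}" delay="${3:-5}"
  local attempt=1 body state
  while [ "$attempt" -le "$max_attempts" ]; do
    body=$(fetch_task "$task_id") || return 2
    state=$(extract_state "$body")
    if [ "$state" = "succeeded" ]; then
      printf '%s\n' "$body"
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}
```

With MUSICAPI_KEY set, poll_task "$TASK_ID" would print the finished task record, from which audio_url can be read. A production client would likely also stop early on whatever failure states the API reports, which this page does not list.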
What audio formats come back?
Each finished generation returns a streamable MP3 audio_url. WAV and stem exports are available through the dedicated WAV and stems endpoints when you need higher-fidelity or multi-track output.
Build it in 5 minutes
Get free credits on signup and run real generations before any payment. No credit card required to start.
API details verified 2026-05-16. The API surface evolves; the pricing page always has current rates.