Stability AI pushes music generation into longer formats
Stability AI is broadening its audio ambitions with a new family of music and sound models that aim to make AI-generated audio longer, more flexible, and easier to deploy across different devices. The company says its new Stability Audio 3.0 lineup includes four models ranging from compact systems meant for on-device use to larger ones capable of generating full musical pieces lasting more than six minutes.
The new release matters for two reasons. First, it substantially extends generation length compared with earlier open versions from the company. Second, it reflects a more segmented strategy for AI audio, where deployment target and licensing model are becoming just as important as raw quality. Stability is not shipping one universal model. It is shipping a portfolio.
Four models, different use cases
According to TechCrunch's reporting, the Stability Audio 3.0 family includes small SFX, small, medium, and large variants. The two smaller models each have 459 million parameters and are intended for on-device sound and music generation, supporting tracks of up to two minutes. The medium model comes in at 1.4 billion parameters, and the large model at 2.7 billion.
For users focused on full songs rather than short clips, the bigger shift is at the upper end of the range. Stability says the medium and large models can generate compositions up to 6 minutes and 20 seconds while maintaining melodic tone and overall musical structure. That is more than double the length supported by Stable Audio 2.0, released in 2024, and far beyond the 47-second limit of the earlier Stable Audio Open release.
Length is not just a cosmetic metric in music generation. Short clips can work for effects, loops, and concepting, but longer-form generation raises the possibility of more complete demos, soundtrack sketches, and draft compositions. That makes the models more relevant to creators who need continuity and development rather than isolated audio moments.
Open weights, with limits
Stability is also drawing a line between what it wants broadly adopted and what it plans to commercialize more tightly. The company is making the small SFX, small, and medium models available with open weights, allowing developers and researchers to use and modify them. The large model, by contrast, is being kept behind API and paid self-hosting options. Companies with more than $1 million in revenue will need an enterprise license.
This structure says a lot about where the market is heading. Open-weight releases remain a powerful distribution tool, especially for developer goodwill and ecosystem growth. But the most capable model often becomes the monetized tier, especially when inference costs and enterprise demand rise. Stability is following a pattern already familiar in image and language AI: openness as a growth engine, controlled access as the business layer.
The licensing question is now central
The other major issue hanging over the music-generation sector is training data. TechCrunch's reporting places Stability’s release in the context of ongoing legal pressure around music AI, pointing to the court fights involving Suno and Udio. In this environment, licensing is not a side note. It is one of the core competitive variables.
Stability says its latest audio models were built on fully licensed data. That claim is particularly important because long-term commercial viability in AI music may depend less on who can generate a song and more on who can do it with a rights structure that labels, publishers, and enterprise customers will accept. Last year, Stability reached agreements with Warner Music Group and Universal Music Group to develop models and music-creation tools. Those relationships now look less like branding wins and more like infrastructure for survival in a legally contested market.
A bigger play for professional musicians
The release also hints at a wider product strategy. Stability says it is developing a new suite of products for professional musicians, though the report gives no feature details. It has also hired Ethan Kaplan, formerly chief digital officer at Universal Audio and Fender, to lead its professional music offering.
That move mirrors a broader trend across generative audio companies, many of which are now hiring music-industry executives to bolster credibility and navigate licensing, partnerships, and go-to-market strategy. The technology is improving quickly, but companies increasingly need domain fluency as much as model capability.
- Small models are aimed at on-device generation of up to two minutes.
- Medium and large models target longer compositions up to 6 minutes and 20 seconds.
- Three models are available with open weights, while the largest remains paid and more tightly controlled.
- Stability says the new models were trained on fully licensed data.
Why this release matters
Stability Audio 3.0 does not settle the music-AI debate, and the company’s performance claims will ultimately be judged by creators and developers. But the launch is still a meaningful industry marker. It combines longer-form generation, a mixed open-and-commercial release strategy, and a licensing-first posture at a time when the audio AI market is moving from novelty toward infrastructure. In other words, Stability is no longer just trying to prove that AI can make music. It is trying to show that AI music can be productized, deployed, and commercialized at scale.
This article is based on reporting by TechCrunch, originally published on techcrunch.com.








