Sonauto
Last reviewed
May 20, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 3,523 words
Sonauto is a generative artificial intelligence music platform that converts text prompts, lyrics, and melody inputs into complete songs with vocals and instrumentation. The company was founded in 2023 by Cornell alumni Ryan Tremblay and Hayden Housen, is headquartered in San Francisco, and was accepted into the Y Combinator Winter 2024 (W24) batch.[^1][^2] Its core music model, branded Melodia, was originally developed as a latent diffusion model with a variational autoencoder bottleneck and a diffusion transformer, an architectural choice the founders publicly contrasted with the token language model approach used by larger competitors such as [[suno|Suno]] and [[udio|Udio]].[^3] Sonauto launched its first public version in March 2024 and exposed a developer API for the Melodia model in March 2025. It is notable in the [[ai_music_generation|AI music generation]] sector for offering an unlimited free consumer tier alongside metered paid API plans.[^4][^5]
| Field | Detail |
|---|---|
| Legal/brand name | Sonauto |
| Founded | 2023 (incorporated); product launched 2024[^1] |
| Founders | Ryan Tremblay (CEO), Hayden Housen (CTO)[^1][^6] |
| Headquarters | San Francisco, California, United States[^1] |
| Y Combinator batch | Winter 2024 (W24)[^1] |
| Team size | Approximately 3 to 4 people in 2025[^1][^7] |
| Flagship model | Melodia (v1.0, v2 Beta, v2.2, v3 preview)[^4][^5][^8] |
| Web product | sonauto.ai (consumer); api.sonauto.ai (developer)[^5][^8] |
| Funding (publicly reported) | Approximately $500,000 seed via Y Combinator[^7] |
| Primary YC partner | Jared Friedman[^1] |
Sonauto was started by Ryan Tremblay and Hayden Housen, who both graduated from Cornell University. Tremblay studied computer science with a machine learning focus and history, and worked on engineering teams at earlier-stage artificial intelligence and creator-economy startups before founding Sonauto. He has publicly stated that he spent roughly a year and a half researching music AI in collaboration with the Harmonai community (the music research collective associated with [[stability_ai|Stability AI]]) and other open-source researchers prior to incorporating the company.[^1] Housen studied computer science at Cornell, published research on paraphrase identification co-authored with Cornell faculty, authored an early open-source transformer-based text summarization library called TransformerSum, and interned in machine learning at Ada Support before co-founding Sonauto.[^6]
The pair incorporated the company in 2023, with the explicit goal of building a music foundation model that would let users generate radio-quality songs from text descriptions and lyrics without conventional music-production skills.[^1] The product strategy emphasized two priorities that would persist through the company's history: a controllable [[diffusion_model|diffusion model]]-based generator rather than a token language model, and a free consumer tier intended to accumulate user feedback and training signal at scale.[^3]
Sonauto was admitted to Y Combinator's Winter 2024 batch, with the company's listing on the Y Combinator directory describing it as "an AI music editor that turns prompts and lyrics into full songs in any style."[^1] The product was first surfaced to a wider technical audience through a "Show HN" submission on Hacker News on 10 April 2024, posted by Tremblay under the username "zaptrem," in which he described the model architecture in some detail and invited public feedback.[^3] The post reached the front page of Hacker News with 454 points and 235 comments, drawing extensive discussion of the platform's free tier, its use of celebrity-styled vocal generations, and its diffusion-based approach.[^3]
A separate Hacker News thread on 10 April 2024 titled "Shoot Out Between the Udio, Sonauto and Suno AI Music Makers" compared Sonauto directly against the two larger, well-funded competitors, drawing further attention to the smaller startup.[^9] In April 2024 the music industry publication Music Ally covered Sonauto, alongside Soundry AI and SongSens AI, in a roundup of new AI music startups, observing that Sonauto's homepage at the time featured AI-generated vocals styled after well-known artists, a presentation choice the publication noted "may not go down as well with the music industry" given the contemporaneous tensions over training data and voice likeness.[^10]
Sonauto participated in Y Combinator's W24 Demo Day in April 2024.[^11] Public funding records aggregated by data providers such as Crunchbase, PitchBook, and Tracxn report a single seed-stage funding round totaling roughly US$500,000, attributed primarily to Y Combinator's standard MFN-SAFE investment for batch companies plus participation from accelerator-network investors. Reported additional named investors include Calm Ventures, Gaingels, Palm Drive Capital, Pioneer Fund, and Rebel Fund.[^7] As of mid-2026, no public reporting documents a follow-on priced seed or Series A round.
The Newcomer newsletter's coverage of pre-Demo-Day W24 funding in late March 2024 noted that Rebel Fund had backed Sonauto ahead of the demo day pitch event, consistent with the named investor list reported by Tracxn.[^11]
Sonauto's public product evolved through several named milestones after its initial release:
By September 2025, the third-party SaaS database Latka reported Sonauto's annual recurring revenue at approximately $330,000 with a team of three, attributed primarily to its developer API plans.[^7]
In Tremblay's public Hacker News description of the v1 model in April 2024, Sonauto's Melodia v1 was characterized as a [[latent_diffusion|latent diffusion model]]: a [[variational_autoencoder|variational autoencoder]] bottleneck compresses raw audio into a continuous, approximately normally distributed latent space, and a [[diffusion_transformer|diffusion transformer]] (described by the founders as "like [[sora|Sora]]") is then trained to denoise latents conditioned on a text/lyric prompt embedding.[^3] Tremblay contrasted this with what was, in 2024, the dominant approach for full-song generation, in which raw audio is tokenized via a residual vector-quantized variational autoencoder (RVQ-VAE) and then modeled with an autoregressive token-level language model. He argued that the diffusion approach provided "interesting properties that make controlling them easier," citing rhythm conditioning (uploading percussion or setting BPM) and variation generation as examples enabled by the continuous latent representation.[^3]
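The denoising objective described above can be illustrated with a minimal sketch. It assumes the generic DDPM-style noising identity and a simple cosine schedule chosen for illustration. The source does not state which schedule, parameterization, or latent dimensions Melodia uses, so `alpha_bar`, `add_noise`, and `recover_z0` are hypothetical names, and the 8-value list is only a stand-in for a VAE latent.

```python
import math
import random


def alpha_bar(t: int, num_steps: int) -> float:
    """Fraction of signal variance kept at step t (assumed cosine schedule)."""
    return math.cos((t / num_steps) * math.pi / 2) ** 2


def add_noise(z0, eps, t, num_steps):
    """Forward process: z_t = sqrt(a) * z0 + sqrt(1 - a) * eps."""
    a = alpha_bar(t, num_steps)
    return [math.sqrt(a) * z + math.sqrt(1 - a) * e for z, e in zip(z0, eps)]


def recover_z0(zt, eps_hat, t, num_steps):
    """Invert the forward process given an estimate of the added noise."""
    a = alpha_bar(t, num_steps)
    return [(z - math.sqrt(1 - a) * e) / math.sqrt(a) for z, e in zip(zt, eps_hat)]


rng = random.Random(0)
z0 = [rng.gauss(0, 1) for _ in range(8)]   # stand-in for a continuous VAE latent
eps = [rng.gauss(0, 1) for _ in range(8)]  # Gaussian noise added during training

zt = add_noise(z0, eps, t=500, num_steps=1000)

# A trained denoiser would predict eps from (zt, t, prompt embedding).
# Here the true eps acts as an oracle, so the clean latent is recovered.
z0_hat = recover_z0(zt, eps, t=500, num_steps=1000)
```

Because the latent stays continuous throughout, conditioning signals such as a target rhythm can in principle be mixed into the denoiser's inputs, which is the kind of controllability the founders cited. How Melodia actually implements that conditioning is not described in the source.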
The founders also claimed that Melodia v1 was "the first audio diffusion model to generate coherent lyrics." Audio was generated in 44.1 kHz stereo, though the founders noted that compression imposed by the VAE bottleneck reduced fidelity relative to uncompressed source audio.[^3]
To support an unlimited free tier, Sonauto built its own inference infrastructure rather than relying on external GPU inference providers. The founders publicly described this as a deliberate strategy to keep the marginal cost of each generation low.[^3]
The v2 Beta model, launched in January 2025, retained the latent diffusion approach but, per Tremblay's launch description, used a larger model and an improved generative adversarial network (GAN) decoder relative to v1. The launch claimed substantial gains in vocal quality, audio fidelity, and stylistic diversity, with Tremblay framing the diversity goal as a deliberate contrast with competitors whose models he characterized as collapsing many prompts into generic modern pop output.[^4][^12]
Sonauto's developer site describes v3 as built on a "language model architecture" that "delivers dramatically lower compute costs per generation," with consistent latency and approximately 15-second time-to-first-audio for streaming responses.[^5] This represents a partial reversal of the company's earlier positioning, in which the latent diffusion approach was promoted as a primary differentiator. As of mid-2026, Sonauto has not published peer-reviewed papers or detailed model cards documenting v3's precise architecture, training data sources, parameter counts, or training compute.
Across versions, the Melodia API exposes several non-generation operations that are characteristic of the broader [[ai_music_generation|AI music generation]] tooling space:
The consumer-facing product at sonauto.ai accepts a free-text prompt describing the desired song (genre, mood, instrumentation, lyrical theme) and optionally user-provided lyrics, then produces a finished song with vocals and instrumentation. The interface offers "Simple" prompt entry, a "Fancy" mode in which users can specify genre, tempo, and key, and an instrumental mode. The platform also includes social features (following, comments, playlists, staff-picked tracks, search, and trending) and a remix-oriented community structure introduced in v2 Beta.[^12]
Generated track length depends on model version. Multiple secondary reviews describe v1 outputs as typically about 1.5 minutes long, with the v3 preview supporting songs of up to roughly 4.5 minutes.[^13] The consumer site has consistently been marketed as offering unlimited free generation without a credit system or daily cap, though Sonauto's documentation has indicated that some preview model tiers may be limited or restricted to paid users, and the company has reserved the right to introduce future restrictions.[^13]
The Melodia developer API is documented at sonauto.ai/developers. Generation requests are submitted as HTTP POST requests to versioned endpoints (for example https://api.sonauto.ai/v1/generations/v3), return a task identifier for asynchronous polling or webhook delivery, and support multiple output formats. Each baseline song generation consumes 100 credits, with multi-track requests consuming additional credits proportionally.[^5]
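The submit-then-poll flow described above might look roughly like the following sketch. Only the endpoint URL and the 100-credit baseline come from the text. The JSON field names (`prompt`, `lyrics`), the bearer-token header, and the status strings are assumptions made for illustration and should be checked against the official documentation. The code builds a request object but does not send it.

```python
import json
import time
import urllib.request
from typing import Callable, Optional, Set

# Endpoint as given in the article; everything else below is assumed.
GENERATIONS_URL = "https://api.sonauto.ai/v1/generations/v3"
CREDITS_PER_SONG = 100  # baseline cost per generation, per the article


def build_generation_request(
    api_key: str, prompt: str, lyrics: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST request for one generation."""
    payload = {"prompt": prompt}        # hypothetical field name
    if lyrics is not None:
        payload["lyrics"] = lyrics      # hypothetical field name
    return urllib.request.Request(
        GENERATIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def poll_task(
    fetch_status: Callable[[str], str],
    task_id: str,
    terminal_states: Set[str],
    max_attempts: int = 30,
    delay_s: float = 2.0,
) -> str:
    """Call fetch_status(task_id) until it returns a terminal state."""
    for _ in range(max_attempts):
        status = fetch_status(task_id)
        if status in terminal_states:
            return status
        time.sleep(delay_s)
    raise TimeoutError(f"task {task_id} did not finish in {max_attempts} polls")


request = build_generation_request("YOUR_API_KEY", "dreamy shoegaze, 90 BPM")
```

Passing the status lookup in as `fetch_status` keeps the polling loop independent of the real response schema, which the article does not specify. A webhook, which the API also supports according to the text, would remove the need for polling altogether.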
Sonauto's published API pricing tiers as of 2026 are:[^5]
| Plan | Price (USD per month) | Monthly credits | Approximate songs | Overage rate |
|---|---|---|---|---|
| Free trial | $0 (on signup) | 1,500 | ~15 | n/a |
| Starter | $11 | 20,000 | ~200 | $0.06 / 100 credits |
| Pro | $88 | 160,000 | ~1,600 | $0.06 / 100 credits |
| Scale | $330 | 660,000 | ~6,600 | $0.05 / 100 credits |
| Enterprise | $1,150 | 2,875,000 | ~28,750 | $0.04 / 100 credits |
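The effective per-song price implied by the table can be worked out with a short calculation. The sketch below uses only the figures in the table and the 100-credits-per-song baseline. The function names are illustrative, and it assumes that overage is billed per 100 credits beyond the monthly allowance, which the table implies but does not spell out.

```python
CREDITS_PER_SONG = 100  # baseline generation cost, per the article

# plan name -> (monthly price in USD, monthly credits, overage USD per 100 credits)
PLANS = {
    "Starter": (11, 20_000, 0.06),
    "Pro": (88, 160_000, 0.06),
    "Scale": (330, 660_000, 0.05),
    "Enterprise": (1_150, 2_875_000, 0.04),
}


def songs_included(credits: int) -> int:
    """Number of baseline songs covered by a monthly credit allowance."""
    return credits // CREDITS_PER_SONG


def cost_per_included_song(plan: str) -> float:
    """Monthly price divided by the songs the allowance covers."""
    price, credits, _ = PLANS[plan]
    return price / songs_included(credits)


def monthly_cost(plan: str, songs: int) -> float:
    """Total monthly cost for a given song count, including assumed overage."""
    price, credits, overage_per_100 = PLANS[plan]
    extra_songs = max(0, songs - songs_included(credits))
    extra_credits = extra_songs * CREDITS_PER_SONG
    return price + (extra_credits / 100) * overage_per_100


for name in PLANS:
    print(f"{name}: ${cost_per_included_song(name):.3f} per included song")
```

On these figures, Starter and Pro both work out to about $0.055 per included song, Scale to $0.05, and Enterprise to $0.04. For Starter and Pro the overage rate ($0.06 per song) is slightly higher than the included rate, while for Scale and Enterprise the two are equal.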
In addition to direct API access, Sonauto v2.2 is offered via the third-party GPU inference platform [[fal_ai|fal.ai]] at a per-call rate of US$0.075 per generation, exposing three endpoints (Text to Music, Extend, and Inpaint) with confirmed support for English, Spanish, French, and German lyric generation.[^8]
Sonauto's reception has been most concentrated within technical and prosumer communities rather than in mass-market music press. Its initial Hacker News launch in April 2024 was a top-ranked post for the day, and reviewers in the AI tooling space have repeatedly noted the company's unusual decision to offer an unlimited free consumer tier in a market where the leading competitors (Suno and Udio) operate freemium models with daily credit limits.[^3][^9][^13]
Reviews published in 2025 and 2026 describe Sonauto's output quality as competitive on instrumental texture and stylistic diversity but generally weaker than [[suno|Suno]] on vocal phrasing and emotional dynamics, particularly at slower tempos.[^14] The platform has also been singled out by reviewers and by the IRCAM Amplify research group, which announced in 2025 that its AI music detector tool had been updated to identify Sonauto-generated audio, indicating that Sonauto-generated tracks were appearing in datasets in volumes large enough to warrant detector coverage.[^15]
By September 2025, Sonauto's annual recurring revenue was reported at approximately $330,000, generated primarily by paid API customers, against a team of three.[^7] No public statistics for monthly active users have been disclosed.
Sonauto operates in a crowded segment of [[ai_music_generation|AI music generation]] tools. Its primary differentiators are its smaller scale, its persistent free consumer tier, and its explicit (if increasingly mixed) bet on diffusion-based generation rather than token language models.
| System | Producer | First public release | Generation type | Notes |
|---|---|---|---|---|
| Sonauto (Melodia) | Sonauto (YC W24) | March/April 2024[^3] | Latent diffusion (v1, v2); language model (v3 preview)[^3][^5] | Free consumer tier; metered API[^5][^13] |
| [[suno\|Suno]] | Suno, Inc. | December 2023 (v1)[^16] | Token language model on RVQ tokens[^16] | |
| [[suno_v5\|Suno v5]] | Suno, Inc. | 2025[^16] | Updated token language model | |
| [[udio\|Udio]] | Uncharted Labs | April 2024 | Token language model | |
| [[stable_audio\|Stable Audio]] | [[stability_ai\|Stability AI]] | 2023 | | |
| [[stable_audio_2_5\|Stable Audio 2.5]] | [[stability_ai\|Stability AI]] | 2025 | | |
| [[elevenlabs_music\|ElevenLabs Music]] | [[elevenlabs\|ElevenLabs]] | 2025 | | |
| [[musicgen\|MusicGen]] | Meta (FAIR) | 2023 (part of [[audiocraft\|AudioCraft]]) | | |
| [[boomy\|Boomy]] | Boomy Corporation | 2018 | Rules and statistical models with later neural extensions | |
Diffusion-based approaches in this segment trace back to and overlap with the Harmonai community's open work on latent audio diffusion and to Stable Audio's published research on long-form latent diffusion for music. The token language model approach used by Suno, Udio, and MusicGen follows a different lineage in which audio is first compressed to discrete tokens via a residual quantizer and then modeled autoregressively.
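The residual quantization step mentioned above can be illustrated with a toy example. This is a minimal sketch of generic residual vector quantization with hand-picked two-dimensional codebooks. Production systems learn their codebooks jointly with an encoder and decoder, and nothing here reflects the actual implementation of Suno, Udio, or MusicGen.

```python
def nearest(codebook, vector):
    """Index of the codebook entry closest to vector (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], vector)),
    )


def rvq_encode(vector, codebooks):
    """Quantize in stages; each stage encodes the residual left by the last."""
    residual = list(vector)
    tokens = []
    for codebook in codebooks:
        index = nearest(codebook, residual)
        tokens.append(index)
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return tokens


def rvq_decode(tokens, codebooks):
    """Sum the selected entry from each stage to approximate the input."""
    output = [0.0] * len(codebooks[0][0])
    for index, codebook in zip(tokens, codebooks):
        output = [o + c for o, c in zip(output, codebook[index])]
    return output


def make_codebook(scale):
    """Four corner points at the given scale (illustrative, not learned)."""
    return [(-scale, -scale), (-scale, scale), (scale, -scale), (scale, scale)]


codebooks = [make_codebook(1.0), make_codebook(0.5), make_codebook(0.25)]
frame = (0.8, -0.3)  # stand-in for one encoder output frame

tokens = rvq_encode(frame, codebooks)          # one discrete token per stage
approximation = rvq_decode(tokens, codebooks)  # coarse-to-fine reconstruction
```

Each stage adds one discrete token per frame, and the reconstruction improves as stages are added. An autoregressive language model then predicts these token sequences, in contrast to the diffusion approach, which operates on continuous latents directly.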
Generative music platforms operate under the same baseline United States copyright framework that has applied to all generative AI output since 2023. The US Copyright Office's March 2023 statement of policy on works containing material generated by AI established that purely AI-generated content without sufficient human authorship is not eligible for copyright protection in the United States, while works in which a human contributed substantial original expression to selection, arrangement, or modification of AI output may be registrable for that human contribution. This guidance applies to Sonauto-generated music in the same way it applies to output from Suno, Udio, Stable Audio, and similar systems.
Sonauto's consumer terms have, as of mid-2026, been described by reviewers as providing limited public detail about commercial use of generated tracks compared to competing platforms; multiple independent reviewers have specifically advised users to verify commercial usage rights against the current Terms of Service rather than relying on third-party summaries.[^13][^14]
In June 2024, the Recording Industry Association of America (RIAA), acting on behalf of major labels including UMG, filed two separate copyright infringement lawsuits: one against [[suno|Suno]] in the US District Court for the District of Massachusetts (see [[riaa_v_suno|UMG v. Suno]]), and one against Uncharted Labs (the operator of [[udio|Udio]]) in the US District Court for the Southern District of New York (see [[riaa_v_udio|UMG v. Uncharted Labs (Udio)]]). Both suits alleged that the defendants had trained their models on copyrighted sound recordings without permission, and both defendants subsequently acknowledged training on unlicensed material while asserting a fair use defense.[^17]
Sonauto was not named in either suit. As of mid-2026, no public reporting documents litigation by major US rights holders against Sonauto specifically, nor has Sonauto issued a public statement of position on the RIAA actions equivalent to those of Suno and Udio. Reviewers and trade press have noted Sonauto's earlier use of vocal generations styled after well-known artists in its homepage examples, which drew negative coverage from music-industry publications in 2024 and which raises legal questions in the same general category as those addressed by the RIAA lawsuits against the larger competitors.[^10]
Public criticism and review of Sonauto in 2025 and 2026 has converged on several recurring limitations: