# Sonauto

> Source: https://aiwiki.ai/wiki/sonauto
> Updated: 2026-06-07
> Categories: AI Tools & Products, Generative AI, Music & Audio Generation
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**Sonauto** is a generative artificial intelligence music platform that converts text prompts, lyrics, and melody inputs into complete songs with vocals and instrumentation. The company was founded in 2023 by Cornell alumni Ryan Tremblay and Hayden Housen, is headquartered in San Francisco, and was accepted into the Y Combinator Winter 2024 (W24) batch.[^1][^2] Its core music model is branded **Melodia**, originally developed as a latent diffusion model with a variational autoencoder bottleneck and a diffusion transformer, an architectural choice the founders publicly contrasted with the token language model approach used by larger competitors such as [Suno](/wiki/suno) and [Udio](/wiki/udio).[^3] Sonauto launched its first public version in March 2024, exposed a developer API for the Melodia model in March 2025, and is notable in the [AI music generation](/wiki/ai_music_generation) sector for offering an unlimited free consumer tier alongside metered paid API plans.[^4][^5]

## Company overview

| Field | Detail |
| --- | --- |
| Legal/brand name | Sonauto |
| Founded | 2023 (incorporated); product launched 2024[^1] |
| Founders | Ryan Tremblay (CEO), Hayden Housen (CTO)[^1][^6] |
| Headquarters | San Francisco, California, United States[^1] |
| Y Combinator batch | Winter 2024 (W24)[^1] |
| Team size | Approximately 3 to 4 people in 2025[^1][^7] |
| Flagship model | Melodia (v1.0, v2 Beta, v2.2, v3 preview)[^4][^5][^8] |
| Web product | sonauto.ai (consumer); api.sonauto.ai (developer)[^5][^8] |
| Funding (publicly reported) | Approximately $500,000 seed via Y Combinator[^7] |
| Primary YC partner | Jared Friedman[^1] |

## History

### Origins and founder background

Sonauto was started by Ryan Tremblay and Hayden Housen, who both graduated from Cornell University. Tremblay studied computer science with a machine learning focus and history, and worked on engineering teams at earlier-stage artificial intelligence and creator-economy startups before founding Sonauto. He has publicly stated that he spent roughly a year and a half researching music AI in collaboration with the Harmonai community (the music research collective associated with [Stability AI](/wiki/stability_ai)) and other open-source researchers prior to incorporating the company.[^1] Housen studied computer science at Cornell, published research on paraphrase identification co-authored with Cornell faculty, authored an early open-source transformer-based text summarization library called TransformerSum, and interned in machine learning at Ada Support before co-founding Sonauto.[^6]

The pair incorporated the company in 2023, with the explicit goal of building a music foundation model that would let users generate radio-quality songs from text descriptions and lyrics without conventional music-production skills.[^1] The product strategy emphasized two priorities that would persist through the company's history: a controllable [diffusion model](/wiki/diffusion_model)-based generator rather than a token language model, and a free consumer tier intended to accumulate user feedback and training signal at scale.[^3]

### Y Combinator Winter 2024 and initial launch

Sonauto was admitted to Y Combinator's Winter 2024 batch, with the company's listing on the Y Combinator directory describing it as "an AI music editor that turns prompts and lyrics into full songs in any style."[^1] The product was first surfaced to a wider technical audience through a "Show HN" submission on Hacker News on 10 April 2024, posted by Tremblay under the username "zaptrem," in which he described the model architecture in some detail and invited public feedback.[^3] The post reached the front page of Hacker News with 454 points and 235 comments, drawing extensive discussion of the platform's free tier, its use of celebrity-styled vocal generations, and its diffusion-based approach.[^3]

A separate Hacker News thread on 10 April 2024 titled "Shoot Out Between the Udio, Sonauto and Suno AI Music Makers" compared Sonauto directly against the two larger, well-funded competitors, drawing further attention to the smaller startup.[^9] In April 2024 the music industry publication *Music Ally* covered Sonauto, alongside Soundry AI and SongSens AI, in a roundup of new AI music startups, observing that Sonauto's homepage at the time featured AI-generated vocals styled after well-known artists, a presentation choice the publication noted "may not go down as well with the music industry" given the contemporaneous tensions over training data and voice likeness.[^10]

### Demo day, seed funding, and follow-on coverage

Sonauto participated in Y Combinator's W24 Demo Day in April 2024.[^11] Public funding records aggregated by data providers such as Crunchbase, PitchBook, and Tracxn report a single seed-stage funding round totaling roughly US$500,000, attributed primarily to Y Combinator's standard MFN-SAFE investment for batch companies plus participation from accelerator-network investors. Reported additional named investors include Calm Ventures, Gaingels, Palm Drive Capital, Pioneer Fund, and Rebel Fund.[^7] As of mid-2026, no public reporting documents a follow-on priced seed or Series A round.

The Newcomer newsletter's coverage of pre-Demo-Day W24 funding in late March 2024 noted that Rebel Fund had backed Sonauto ahead of the demo day pitch event, consistent with the named investor list reported by Tracxn.[^11]

### Product evolution (2024 to 2026)

Sonauto's public product evolved through several named milestones after its initial release:

- **v1.0 (April 2024).** Tremblay announced a "version 1.0" of the consumer product on Hacker News on 17 April 2024, roughly a week after the initial Show HN, after a month of beta testing.[^3]
- **v2 Beta (January 2025).** A second-generation model and substantially revised consumer site were launched on 7 January 2025. The v2 release added community features such as follow graphs, comments, playlists, and "groups" similar to subreddit-style communities. Tremblay claimed in launch posts that the model could match the quality and diversity of paid models from much larger competitors while remaining free.[^12]
- **Sonauto API (March 2025).** A developer-facing API exposing generation, inpainting, extension, and transition endpoints for the underlying Melodia model was launched on 3 March 2025, again via a Show HN submission.[^4]
- **v2.2 (mid-2025).** A point release added user-controllable BPM, song extension to convert 1.5-minute generations into longer tracks, and an audio-editing endpoint that supported in-place modifications of generated audio.[^8][^12]
- **fal.ai partnership (August 2025).** Sonauto v2.2 became available on the third-party model-hosting platform [fal.ai](/wiki/fal_ai) on 28 August 2025, offered at a metered rate of US$0.075 per generation across text-to-music, extend, and inpaint endpoints, at 44.1 kHz 16-bit CD-quality stereo.[^8]
- **v3 Preview (late 2025 to 2026).** A v3 preview was opened to users with extended song lengths and additional style tags. According to the Sonauto developer site, the v3 endpoint exposes approximately 4,160 style tags and reaches first audio in roughly 15 seconds. The developer page describes v3 as built on a "language model architecture" delivering "dramatically lower compute costs per generation," signaling a partial architectural shift from the original latent diffusion stack used for v1 and v2.[^5]

By September 2025, third-party SaaS database Latka reported Sonauto's annual recurring revenue at approximately $330,000 with a team of three, based primarily on its developer API plans.[^7]

## Technical approach

### Original Melodia architecture (v1 and v2)

In Tremblay's public Hacker News description of the v1 model in April 2024, Sonauto's Melodia v1 was characterized as a [latent diffusion model](/wiki/latent_diffusion): a [variational autoencoder](/wiki/variational_autoencoder) bottleneck compresses raw audio into a continuous, approximately normally distributed latent space, and a [diffusion transformer](/wiki/diffusion_transformer) (described by the founders as "like [Sora](/wiki/sora)") is then trained to denoise latents conditioned on a text/lyric prompt embedding.[^3] Tremblay contrasted this with what was, in 2024, the dominant approach for full-song generation, in which raw audio is tokenized via a residual vector-quantized variational autoencoder (RVQ-VAE) and then modeled with an autoregressive token-level language model. He argued that the diffusion approach provided "interesting properties that make controlling them easier," citing rhythm conditioning (uploading percussion or setting BPM) and variation generation as examples enabled by the continuous latent representation.[^3]
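The pipeline described above can be written compactly, assuming a standard DDPM-style formulation. The source does not state which noise schedule or training objective Melodia actually uses, and the symbols $E$, $D$, $\bar\alpha_t$, $\epsilon_\theta$, and $c$ are introduced here purely for illustration:

$$
\begin{aligned}
z_0 &= E(x) && \text{VAE encoder compresses audio } x \text{ into a continuous latent} \\
z_t &= \sqrt{\bar\alpha_t}\, z_0 + \sqrt{1-\bar\alpha_t}\,\epsilon, \quad \epsilon \sim \mathcal{N}(0, I) && \text{forward noising at step } t \\
\mathcal{L} &= \mathbb{E}_{z_0,\, t,\, \epsilon}\left[\lVert \epsilon - \epsilon_\theta(z_t, t, c) \rVert^2\right] && \text{denoiser trained with prompt/lyric embedding } c \\
\hat{x} &= D(\hat{z}_0) && \text{decoder maps the sampled latent back to audio}
\end{aligned}
$$

Under this reading, the diffusion transformer plays the role of $\epsilon_\theta$. Because $z_t$ is continuous rather than a sequence of discrete tokens, extra conditioning signals such as a tempo or percussion track could plausibly be folded into $c$, which is the kind of controllability the founders attribute to the approach.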

The founders also claimed that Melodia v1 was "the first audio diffusion model to generate coherent lyrics." Audio was generated in 44.1 kHz stereo, though they noted that compression at the VAE bottleneck reduced fidelity relative to uncompressed source audio.[^3]

To support a free unlimited tier, Sonauto built its own inference infrastructure rather than relying on external GPU inference providers, a choice the founders described publicly as a deliberate strategy to keep marginal generation cost low enough to support unlimited free consumer use.[^3]

### v2 Beta improvements

The v2 Beta model, launched in January 2025, retained the latent diffusion approach but, per Tremblay's launch description, used a larger model and an improved generative adversarial network (GAN) decoder relative to v1. The launch claimed substantial gains in vocal quality, audio fidelity, and stylistic diversity, with Tremblay framing the diversity goal as a deliberate contrast with competitors whose models he characterized as collapsing many prompts into generic modern pop output.[^4][^12]

### v3 architectural shift

Sonauto's developer site describes v3 as built on a "language model architecture" that "delivers dramatically lower compute costs per generation," with consistent latency and approximately 15-second time-to-first-audio for streaming responses.[^5] This represents a partial inversion of the company's earlier positioning, in which the latent diffusion approach was promoted as a primary differentiator. As of mid-2026, Sonauto has not published peer-reviewed papers or detailed model cards documenting the precise architecture of v3, training data sources, parameter counts, or training compute, and these specifics remain undisclosed in public-facing documentation.

### Auxiliary capabilities

Across versions, the Melodia API exposes several non-generation operations that are characteristic of the broader [AI music generation](/wiki/ai_music_generation) tooling space:

- **Lyrics alignment.** A flag on the generation endpoint, when enabled, produces word-level timing data aligning the prompt lyrics with the generated audio, used by downstream lyric video and karaoke applications.[^5]
- **Extend.** Takes an existing short generation and extends it to a longer track, used to grow v2's roughly 1.5-minute clips into multi-minute songs.[^4][^8]
- **Inpaint.** Replaces a chosen segment of an existing track with newly generated audio while preserving the surrounding context, structurally analogous to image inpainting in diffusion image models.[^4][^8]
- **Stem separation.** Splits a generation into vocals, drums, bass, and "other" instrumental stems for downstream remixing.[^12]
- **Transition generation.** Creates a generated bridge between two audio clips, exposed via the API.[^4]
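As an illustration of how a karaoke or lyric-video tool might consume the word-level timing data mentioned under lyrics alignment, the sketch below groups timed words into lines and formats them as LRC-style timestamps. The input shape (a list of dictionaries with `word`, `start`, and `end` keys, times in seconds) is a hypothetical stand-in, since the source does not document the real response format; the `max_gap_s` line-break heuristic is likewise an assumption.

```python
from typing import Dict, List, Union

# Hypothetical shape of one aligned word: {"word": str, "start": float, "end": float}
Word = Dict[str, Union[str, float]]


def lrc_timestamp(seconds: float) -> str:
    """Format seconds as an LRC tag of the form [mm:ss.xx]."""
    centis = int(round(seconds * 100))
    minutes, rem = divmod(centis, 6000)
    return f"[{minutes:02d}:{rem // 100:02d}.{rem % 100:02d}]"


def words_to_lrc(words: List[Word], max_gap_s: float = 0.8) -> List[str]:
    """Group timed words into lines, breaking where the pause exceeds max_gap_s."""
    lines: List[str] = []
    current: List[str] = []
    line_start = 0.0
    prev_end = None
    for w in words:
        start, end = float(w["start"]), float(w["end"])
        # A long silence since the previous word closes the current line.
        if prev_end is not None and start - prev_end > max_gap_s and current:
            lines.append(lrc_timestamp(line_start) + " ".join(current))
            current = []
        if not current:
            line_start = start
        current.append(str(w["word"]))
        prev_end = end
    if current:
        lines.append(lrc_timestamp(line_start) + " ".join(current))
    return lines
```

Breaking on pauses is only one possible heuristic; a real integration could instead split on the line breaks already present in the submitted lyrics.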

## Product and pricing

### Consumer product

The consumer-facing product at sonauto.ai accepts a free-text prompt describing the desired song (genre, mood, instrumentation, lyrical theme) and optionally user-provided lyrics, then produces a finished song with vocals and instrumentation. The interface offers "Simple" prompt entry, a "Fancy" mode in which users can specify genre, tempo, and key, and an instrumental mode. The platform also includes social features (following, comments, playlists, staff-picked tracks, search, and trending) and a remix-oriented community structure introduced in v2 Beta.[^12]

Generated track length depends on model version. Multiple secondary reviews describe v1 outputs as typically about 1.5 minutes long, with the v3 preview supporting song lengths of up to roughly 4.5 minutes.[^13] The consumer site has consistently been marketed as offering unlimited free generation without a credit system or daily cap, though Sonauto's documentation has indicated that some preview model tiers may be limited or restricted to paid users, and the company has reserved the right to introduce future restrictions.[^13]

### Developer API and pricing

The Melodia developer API is documented at sonauto.ai/developers. Generation requests are submitted as HTTP POST requests to versioned endpoints (for example `https://api.sonauto.ai/v1/generations/v3`), return a task identifier for asynchronous polling or webhook delivery, and support multiple output formats. Each baseline song generation consumes 100 credits, with multi-track requests consuming additional credits proportionally.[^5]
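The submit-then-poll flow described above might look roughly like the following sketch. Only the endpoint URL and the 100-credit baseline come from the text; the Bearer authorization header, the `task_id` and `status` field names, and the `SUCCESS`/`FAILURE` values are hypothetical placeholders that would need to be replaced with the names in the actual API reference. The polling helper takes its transport as callables so the control flow can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

# Endpoint quoted in the text; everything else in this sketch is an assumption.
GENERATE_URL = "https://api.sonauto.ai/v1/generations/v3"
CREDITS_PER_SONG = 100  # baseline cost of one generation, per the text

JsonDict = Dict[str, Any]


def http_post_json(url: str, payload: JsonDict, api_key: str) -> JsonDict:
    """POST a JSON body and decode the JSON reply (Bearer auth is assumed)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def generate_and_wait(
    submit: Callable[[], JsonDict],
    fetch_status: Callable[[str], JsonDict],
    interval_s: float = 2.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> JsonDict:
    """Submit one job, then poll until it reaches a terminal state.

    "task_id", "status", "SUCCESS" and "FAILURE" are hypothetical names.
    """
    task_id = submit()["task_id"]
    for _ in range(max_polls):
        status = fetch_status(task_id)
        if status["status"] == "SUCCESS":
            return status
        if status["status"] == "FAILURE":
            raise RuntimeError(f"generation {task_id} failed")
        sleep(interval_s)
    raise TimeoutError(f"generation {task_id} did not finish in {max_polls} polls")
```

A caller would pass something like `submit=lambda: http_post_json(GENERATE_URL, payload, api_key)` together with a `fetch_status` that reads whatever status resource the real documentation specifies. Webhook delivery, which the text also mentions, would replace the polling loop entirely.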

Sonauto's published API pricing tiers as of 2026 are:[^5]

| Plan | Price (USD per month) | Monthly credits | Approximate songs | Overage rate |
| --- | --- | --- | --- | --- |
| Free trial | $0 (on signup) | 1,500 | ~15 | n/a |
| Starter | $11 | 20,000 | ~200 | $0.06 / 100 credits |
| Pro | $88 | 160,000 | ~1,600 | $0.06 / 100 credits |
| Scale | $330 | 660,000 | ~6,600 | $0.05 / 100 credits |
| Enterprise | $1,150 | 2,875,000 | ~28,750 | $0.04 / 100 credits |

In addition to direct API access, Sonauto v2.2 is offered via the third-party GPU inference platform [fal.ai](/wiki/fal_ai) at a per-call rate of US$0.075 per generation, exposing three endpoints (Text to Music, Extend, and Inpaint) with confirmed support for English, Spanish, French, and German lyric generation.[^8]
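The credit arithmetic behind these figures can be checked with a short sketch. The plan prices, credit allowances, overage rates, and the fal.ai per-call rate are taken from the text; the assumptions are that overage is billed linearly at the listed rate, that every song is a baseline 100-credit generation, and that the one-time free trial is left out of the comparison. The `Plan` class and helper names are illustrative only.

```python
from dataclasses import dataclass

CREDITS_PER_SONG = 100      # baseline generation cost, from the text
FAL_PRICE_PER_CALL = 0.075  # USD per fal.ai generation, from the text


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_usd: float
    monthly_credits: int
    overage_per_100: float  # USD per 100 credits beyond the allowance

    def included_songs(self) -> int:
        return self.monthly_credits // CREDITS_PER_SONG

    def per_song_usd(self) -> float:
        """Effective price per song when the allowance is fully used."""
        return self.monthly_usd / self.included_songs()

    def monthly_cost(self, songs: int) -> float:
        """Plan fee plus linear overage (an assumption) for `songs` songs."""
        extra_songs = max(0, songs - self.included_songs())
        overage = extra_songs * (CREDITS_PER_SONG / 100) * self.overage_per_100
        return round(self.monthly_usd + overage, 2)


# Paid tiers from the pricing table (free trial omitted).
PLANS = [
    Plan("Starter", 11, 20_000, 0.06),
    Plan("Pro", 88, 160_000, 0.06),
    Plan("Scale", 330, 660_000, 0.05),
    Plan("Enterprise", 1_150, 2_875_000, 0.04),
]


def cheapest_plan(songs: int) -> Plan:
    """Return the paid plan with the lowest monthly cost at a given volume."""
    return min(PLANS, key=lambda plan: plan.monthly_cost(songs))


def fal_cost(songs: int) -> float:
    """Cost of the same volume at the fal.ai per-generation rate."""
    return round(songs * FAL_PRICE_PER_CALL, 2)
```

Under these assumptions, 2,000 songs in a month would cost about $119 on Starter with overage and about $112 on Pro, so `cheapest_plan(2000)` selects Pro. The fal.ai figure is not a like-for-like comparison, because that listing covers v2.2 only.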

## Reception and adoption

Sonauto's reception has been concentrated in technical and prosumer communities rather than in the mass-market music press. Its initial Hacker News launch in April 2024 was a top-ranked post for the day, and reviewers in the AI tooling space have repeatedly noted the company's unusual decision to offer an unlimited free consumer tier in a market where the leading competitors (Suno and Udio) operate freemium models with daily credit limits.[^3][^9][^13]

Reviews published in 2025 and 2026 describe Sonauto's output quality as competitive on instrumental texture and stylistic diversity but generally weaker than [Suno](/wiki/suno) on vocal phrasing and emotional dynamics, particularly at slower tempos.[^14] Separately, the IRCAM Amplify research group announced in 2025 that its AI music detector tool had been updated to identify Sonauto-generated audio, which suggests that Sonauto-generated tracks were appearing in volumes large enough to warrant detector coverage.[^15]

By September 2025, Latka reported Sonauto's annual recurring revenue at approximately $330,000, generated primarily by paid API customers, against a team of three.[^7] No public statistics for monthly active users have been disclosed.

## Comparison with related AI music systems

Sonauto operates in a crowded segment of [AI music generation](/wiki/ai_music_generation) tools. Its primary differentiators are its smaller scale, its persistent free consumer tier, and its explicit (if increasingly mixed) bet on diffusion-based generation rather than token language models.

| System | Producer | First public release | Generation type | Notes |
| --- | --- | --- | --- | --- |
| Sonauto (Melodia) | Sonauto (YC W24) | March/April 2024[^3] | Latent diffusion (v1, v2); language model (v3 preview)[^3][^5] | Free consumer tier; metered API[^5][^13] |
| [Suno](/wiki/suno) | Suno, Inc. | December 2023 (v1)[^16] | Token language model on RVQ tokens[^16] | Freemium with paid tiers; subject of [UMG v. Suno](/wiki/riaa_v_suno) (2024)[^17] |
| [Suno v5](/wiki/suno_v5) | Suno, Inc. | 2025[^16] | Updated token language model | Larger paid user base than Sonauto[^16] |
| [Udio](/wiki/udio) | Uncharted Labs | April 2024 | Token language model | Subject of [UMG v. Uncharted Labs (Udio)](/wiki/riaa_v_udio) (2024)[^17] |
| [Stable Audio](/wiki/stable_audio) | [Stability AI](/wiki/stability_ai) | 2023 | Latent diffusion (timestamp-conditioned) | Strong instrumental focus, weaker on vocals |
| [Stable Audio 2.5](/wiki/stable_audio_2_5) | [Stability AI](/wiki/stability_ai) | 2025 | Latent diffusion | Successor to Stable Audio 2.0 |
| [ElevenLabs Music](/wiki/elevenlabs_music) | [ElevenLabs](/wiki/elevenlabs) | 2025 | Proprietary; emphasis on vocal synthesis | Companion to [ElevenLabs v3](/wiki/elevenlabs_v3) voice model |
| [MusicGen](/wiki/musicgen) | Meta (FAIR) | 2023 (part of [AudioCraft](/wiki/audiocraft)) | Token language model over EnCodec tokens | Open-source research model |
| [Boomy](/wiki/boomy) | Boomy Corporation | 2018 | Rules and statistical models with later neural extensions | Earliest mainstream AI-music platform |

Diffusion-based approaches in this segment trace back to and overlap with the Harmonai community's open work on latent audio diffusion and to Stable Audio's published research on long-form latent diffusion for music. The token language model approach used by Suno, Udio, and MusicGen follows a different lineage in which audio is first compressed to discrete tokens via a residual quantizer and then modeled autoregressively.

## Intellectual property and legal context

### US Copyright Office guidance

Generative music platforms operate under the same baseline United States copyright framework that has applied to all generative AI output since 2023. The US Copyright Office's March 2023 statement of policy on works containing material generated by AI established that purely AI-generated content without sufficient human authorship is not eligible for copyright protection in the United States, while works in which a human contributed substantial original expression to selection, arrangement, or modification of AI output may be registrable for that human contribution. This guidance applies to Sonauto-generated music in the same way it applies to output from Suno, Udio, Stable Audio, and similar systems.

Sonauto's consumer terms have, as of mid-2026, been described by reviewers as providing limited public detail about commercial use of generated tracks compared to competing platforms; multiple independent reviewers have specifically advised users to verify commercial usage rights against the current Terms of Service rather than relying on third-party summaries.[^13][^14]

### RIAA lawsuits and Sonauto's position

In June 2024, the Recording Industry Association of America (RIAA), acting on behalf of major labels including UMG, filed two separate copyright infringement lawsuits: one against [Suno](/wiki/suno) in the US District Court for the District of Massachusetts (see [UMG v. Suno](/wiki/riaa_v_suno)), and one against Uncharted Labs (the operator of [Udio](/wiki/udio)) in the US District Court for the Southern District of New York (see [UMG v. Uncharted Labs (Udio)](/wiki/riaa_v_udio)). Both suits alleged that the defendants had trained their models on copyrighted sound recordings without permission, and both defendants subsequently acknowledged training on unlicensed material while asserting a fair use defense.[^17]

Sonauto was not named in either suit. As of mid-2026, no public reporting documents litigation by major US rights holders against Sonauto specifically, nor has Sonauto issued a public statement of position on the RIAA actions equivalent to those of Suno and Udio. Reviewers and trade press have noted Sonauto's earlier use of vocal generations styled after well-known artists in its homepage examples, which drew negative coverage from music-industry publications in 2024 and which raises legal questions in the same general category as those addressed by the RIAA lawsuits against the larger competitors.[^10]

## Limitations

Public criticism and reviews of Sonauto in 2025 and 2026 have converged on several recurring limitations:

- **Vocal quality at low tempos.** Multiple comparative reviews report that vocals can sound noticeably synthetic at slower tempos, with Suno often described as having the more naturalistic vocal phrasing among the major paid systems.[^14]
- **Documentation gaps.** Independent reviewers have noted that the consumer site does not always document detailed model behavior, export formats, sample rates, stem availability, or commercial licensing terms in a single canonical location, requiring users to consult the Terms of Service directly.[^13]
- **Architectural opacity.** Beyond high-level descriptions on Hacker News and on the developer page, Sonauto has not published a model card, technical report, or peer-reviewed paper documenting training data, model size, or evaluation protocols for Melodia. This is in contrast to lab-released systems such as [MusicGen](/wiki/musicgen) and the published [Stable Audio](/wiki/stable_audio_2_5) research papers from Stability AI.
- **Detectability.** As of 2025, third-party AI music detectors such as IRCAM Amplify's tool reported coverage of Sonauto-generated audio, indicating that Sonauto outputs carry detectable model signatures consistent with other contemporary AI music systems.[^15]
- **Unconfirmed long-term sustainability of free tier.** Sonauto's free consumer tier has been supported by its own inference infrastructure, but reviewers have repeatedly noted that the unlimited free model may not be sustainable indefinitely as the company scales, and Sonauto's own communications have signaled that some model variants may eventually be restricted to paying users.[^13]

## See also

- [Suno](/wiki/suno)
- [Suno v5](/wiki/suno_v5)
- [Udio](/wiki/udio)
- [UMG v. Suno](/wiki/riaa_v_suno)
- [UMG v. Uncharted Labs (Udio)](/wiki/riaa_v_udio)
- [Stable Audio](/wiki/stable_audio)
- [Stable Audio 2.5](/wiki/stable_audio_2_5)
- [ElevenLabs Music](/wiki/elevenlabs_music)
- [MusicGen](/wiki/musicgen)
- [AudioCraft](/wiki/audiocraft)
- [Boomy](/wiki/boomy)
- [AI Music Generation](/wiki/ai_music_generation)
- [Diffusion model](/wiki/diffusion_model)
- [Latent diffusion model](/wiki/latent_diffusion)
- [Diffusion Transformer (DiT)](/wiki/diffusion_transformer)
- [Variational Autoencoder](/wiki/variational_autoencoder)
- [fal.ai](/wiki/fal_ai)
- [Stability AI](/wiki/stability_ai)

## References

[^1]: Y Combinator, "Sonauto: Create hit songs with AI", Y Combinator company directory, 2024. https://www.ycombinator.com/companies/sonauto. Accessed 2026-05-20.

[^2]: Y Combinator, "Launch YC: Sonauto - Make hit songs with AI", Y Combinator Launches, 2024. https://www.ycombinator.com/launches/Kb5-sonauto-make-hit-songs-with-ai. Accessed 2026-05-20.

[^3]: Ryan Tremblay ("zaptrem"), "Show HN: Sonauto - A more controllable AI music creator", Hacker News, 2024-04-10. https://news.ycombinator.com/item?id=39992817. Accessed 2026-05-20.

[^4]: Ryan Tremblay ("zaptrem"), "Show HN: Sonauto API - Generative music for developers", Hacker News, 2025-03-03. https://news.ycombinator.com/item?id=43244166. Accessed 2026-05-20.

[^5]: Sonauto, "Melodia API for Developers", sonauto.ai developer documentation, 2025. https://sonauto.ai/developers. Accessed 2026-05-20.

[^6]: Hayden Housen, "Personal site / Co-Founder, Sonauto", haydenhousen.com, 2024. https://haydenhousen.com/. Accessed 2026-05-20.

[^7]: Latka (Nathan Latka), "How Sonauto hit $330K revenue with a 3 person team in 2025", getlatka.com, 2025. https://getlatka.com/companies/sonauto.ai. Accessed 2026-05-20.

[^8]: fal.ai, "Sonauto Now Available on fal", blog.fal.ai, 2025-08-28. https://blog.fal.ai/sonauto-now-available-on-fal/. Accessed 2026-05-20.

[^9]: Hacker News, "Shoot Out Between the Udio, Sonauto and Suno AI Music Makers", Hacker News, 2024-04-10. https://news.ycombinator.com/item?id=39995985. Accessed 2026-05-20.

[^10]: Stuart Dredge, "Three more AI music startups: Soundry AI, Sonauto, SongSens.ai", Music Ally, 2024-04-04. https://musically.com/2024/04/04/three-more-ai-music-startups-soundry-ai-sonauto-songsens%C2%B7ai/. Accessed 2026-05-20.

[^11]: Eric Newcomer, "YC Startups Land Funding Ahead of Demo Day...LP Check Sizes Are Shrinking... Anthropic Raises Billions", Newcomer, 2024. https://www.newcomer.co/p/yc-startups-land-funding-ahead-of. Accessed 2026-05-20.

[^12]: Sonauto and Product Hunt, "Sonauto v2 Beta", Product Hunt forums, 2025-01-07. https://www.producthunt.com/p/sonauto/sonauto-v2-beta. Accessed 2026-05-20.

[^13]: ReviewNexa, "I Tested Sonauto AI for 30 Days: Is It Really Unlimited & Free?", ReviewNexa, 2025. https://reviewnexa.com/sonauto-review/. Accessed 2026-05-20.

[^14]: Skywork AI, "Sonauto Review 2025: Can This AI Music Generator Rival Suno & Udio?", Skywork AI blog, 2025. https://skywork.ai/blog/sonauto-review-2025/. Accessed 2026-05-20.

[^15]: IRCAM Amplify, "AI Music Detector Now Identifies Sonauto-Generated Tracks", IRCAM Amplify blog, 2025. https://www.ircamamplify.io/blog/ai-music-detector-now-detects-sonauto. Accessed 2026-05-20.

[^16]: Amanda Silberling, "AI music generator Suno hits 2M paid subscribers and $300M in annual recurring revenue", TechCrunch, 2026-02-27. https://techcrunch.com/2026/02/27/ai-music-generator-suno-hits-2-million-paid-subscribers-and-300m-in-annual-recurring-revenue/. Accessed 2026-05-20.

[^17]: RIAA, "Record Companies Bring Landmark Cases for Responsible AI Against Suno and Udio in Boston and New York Federal Courts, Respectively", RIAA.com news release, 2024-06-24. https://www.riaa.com/record-companies-bring-landmark-cases-for-responsible-ai-againstsuno-and-udio-in-boston-and-new-york-federal-courts-respectively/. Accessed 2026-05-20.

