The Copyright Reckoning: How AI Music Startups Face Massive Legal Battles Over Training Data
Record labels have discovered that AI music generators trained on millions of copyrighted recordings without permission, and they're now dramatically expanding their legal cases to reflect the true scope of the infringement. Universal Music Group and Sony Music Entertainment filed a motion to add more than 61,000 copyrighted sound recordings to their lawsuit against Suno after using audio fingerprinting technology to identify their works within the startup's training data. Meanwhile, Sony separately moved to add over 30,000 copyrighted recordings to its parallel case against rival AI music generator Udio.
What Exactly Did These AI Companies Use to Train Their Models?
The discovery process has revealed the staggering scale of how AI music generators built their systems. In the Suno case, the original complaint filed in June 2024 asserted just 560 copyrighted works. Now, after gaining access to Suno's training data, the labels say they identified "millions" of their recordings in it, though the expanded complaint asserts "only a small fraction" of those: 61,026 works. The labels were "forced to undertake a costly and burdensome review" because Suno "declined to offer any transparency outside of the discovery process," according to their filing.
To identify which recordings were used, the plaintiffs employed Audible Magic, an industry-standard audio fingerprinting technology. The process took months. In the first stage, the labels' experts created digital fingerprints of each audio file in Suno's training data, requiring two full weeks of work across two visits to a secure room at Suno's outside counsel's office. Those fingerprints were then checked against Audible Magic's content-recognition database for matches.
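The two-stage process described above (fingerprint each file, then look the fingerprints up in a reference database) can be illustrated with a toy sketch. This is not Audible Magic's algorithm, which is proprietary; the function names, the window-hashing scheme, and the sample data are all assumptions made for illustration. Real systems rely on perceptual features that survive re-encoding and noise, which the exact hashing here does not.

```python
import hashlib

def fingerprint(samples, window=4):
    """Toy fingerprint (hypothetical scheme): hashes of sliding windows of rounded samples."""
    hashes = set()
    for i in range(len(samples) - window + 1):
        chunk = ",".join(str(round(s, 1)) for s in samples[i:i + window])
        hashes.add(hashlib.sha1(chunk.encode()).hexdigest()[:16])
    return hashes

def best_match(query_fp, reference_db, threshold=0.5):
    """Return (title, score) for the reference sharing the most hashes, or None."""
    best = None
    for title, ref_fp in reference_db.items():
        if not ref_fp:
            continue
        # Fraction of the reference's hashes that also appear in the query.
        score = len(query_fp & ref_fp) / len(ref_fp)
        if score >= threshold and (best is None or score > best[1]):
            best = (title, score)
    return best

# Stage 2 stand-in: a tiny reference database with made-up audio samples.
reference_db = {
    "Song A": fingerprint([0.1, 0.5, 0.9, 0.3, 0.7, 0.2]),
    "Song B": fingerprint([0.9, 0.8, 0.1, 0.0, 0.4, 0.6]),
}

# Stage 1 stand-in: fingerprint a file from the training data, then look it up.
training_file = [0.1, 0.5, 0.9, 0.3, 0.7, 0.2]
match = best_match(fingerprint(training_file), reference_db)
print(match)
```

The sketch also shows why the stages can be separated: the fingerprints can be generated where the training data is held, and only the compact fingerprints need to be compared against the reference database.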
The discovery disputes themselves reveal how contentious this process became. Suno initially agreed to permit the first stage of analysis in June 2025, but then rescinded its consent on July 8, 2025, citing security concerns. A magistrate judge eventually brokered a compromise in October 2025, and the analysis was completed on January 2, 2026, with final results delivered on January 15, 2026.
How Are the Record Labels Building Their Cases?
- Audio Fingerprinting Analysis: Labels used Audible Magic technology to scan Suno's training data and identify matches against their copyrighted works, a process that took months and required multiple court interventions to complete.
- Manual Verification Process: For Sony, the process involved identifying works believed to be registered with the Copyright Office, compiling registration certificates, manually looking up ISRC numbers, and comparing those against fingerprinting results.
- Artist and Rights Confirmation: For Universal, the process included creating lists of works asserted in prior litigation, manually searching for and confirming rights ownership, and conferring with artist representatives to verify claims.
- Strategic Subset Selection: The labels acknowledge they "could have devoted additional time to reviewing the complete set" of fingerprinting results but "concentrated on a representative subset" to advance the litigation.
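The cross-referencing step in the verification process above, comparing ISRCs of registered works against fingerprinting results, amounts to a set intersection. The following is a minimal sketch under assumed data shapes; the `RegisteredWork` record, the placeholder ISRCs, and the placeholder registration numbers are hypothetical and not drawn from the filings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegisteredWork:
    title: str
    isrc: str             # International Standard Recording Code
    registration_no: str  # placeholder for a Copyright Office registration number

def normalize_isrc(isrc):
    """Strip hyphens and spaces and uppercase, so differently formatted codes compare equal."""
    return isrc.replace("-", "").replace(" ", "").upper()

def confirmed_works(registered, matched_isrcs):
    """Keep only registered works whose ISRC also appears in the fingerprint matches."""
    matched = {normalize_isrc(i) for i in matched_isrcs}
    return [w for w in registered if normalize_isrc(w.isrc) in matched]

# Hypothetical inputs: a label's registered works and ISRCs reported by fingerprinting.
registered = [
    RegisteredWork("Track One", "US-AAA-24-00001", "SR-PLACEHOLDER-1"),
    RegisteredWork("Track Two", "US-AAA-24-00002", "SR-PLACEHOLDER-2"),
]
fingerprint_hits = ["usaaa2400001", "US-ZZZ-24-99999"]

assertable = confirmed_works(registered, fingerprint_hits)
print([w.title for w in assertable])
```

Only works that are both registered and matched survive the filter, which is one way a list of asserted works could end up much smaller than the raw set of fingerprint matches.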
In the Udio case, Sony's motion to add 30,442 copyrighted works follows a similar pattern. The company identified these additional works "after gaining access to Udio's training data during the discovery process," according to the filing. Sony argues the court should grant the amendment because the plaintiffs "were diligent in identifying the additional works" and "there is no question that Plaintiffs have a viable claim for direct copyright infringement."
Why Are Some Labels Settling While Others Fight?
The legal landscape has shifted dramatically. Universal Music Group and Warner Music Group, originally co-plaintiffs in both cases, have each settled at least part of their disputes. Universal reached a licensing deal with Udio in October 2025, and Warner followed suit in November 2025. Warner also settled with Suno and struck a licensing deal, and it was voluntarily dismissed from that case in January 2026.
Sony and Universal remain active plaintiffs in the Suno case, but their licensing negotiations with the startup have reportedly stalled. The two labels have fought to obtain the terms of Suno's settlement with Warner Music, arguing that the deal represents a "forward-looking commercial arrangement" rather than merely a backward-looking settlement. Suno has accused the labels of attempting to "relitigate a dispute they lost."
Suno opposes the motion to expand the lawsuit, arguing that the amendment "would effectively start the case over" and that it is "entitled to an expeditious resolution of its fair use defense." The labels counter that "denying leave to amend on that ground would effectively reward Suno for copying copyrighted works on an unprecedented scale and then hiding that copying from public view."
What Legal Arguments Are Both Sides Making?
The core dispute centers on whether AI music generators' use of copyrighted recordings for training falls under "fair use," a legal doctrine that permits limited use of copyrighted material without permission. In August 2024, both Udio and Suno largely admitted they had used copyrighted recordings to train their models but argued their use qualified as fair use. The labels amended their complaints in September 2025 to add a Digital Millennium Copyright Act (DMCA) claim, alleging that Udio circumvented YouTube's technological protections to collect copyrighted recordings for training data.
In April 2026, a federal judge denied Udio's motion to dismiss the DMCA claim, finding that the plaintiffs had "plausibly allege[d] that YouTube employs technological measures that regulate access to its content and that Defendant circumvented them." Later that month, Udio admitted it "obtained audio data from YouTube for use as training data" and had used yt-dlp, a stream-ripping tool, to acquire some of that data.
Udio maintains its fair use defense, calling its AI tool's back-end process "quintessential fair use," and has accused Sony of "anticompetitive activities that extend an unlawful monopoly over the production and commercialization of music." The case is scheduled for summary judgment motions by January 8, 2027, though that deadline may shift if the court grants the motion to expand the complaint.
What Does This Mean for the AI Music Industry?
The expanding lawsuits underscore a fundamental tension in the AI music space. While some companies like ElevenLabs have emphasized building their music-generation models on "licensed data and cleared for commercial use," allowing users to freely use generated tracks, others have faced significant legal challenges. The scale of the copyright infringement allegations, with tens of thousands of works identified in discovery, suggests the industry's training practices have been far more extensive than initially disclosed.
The settlements with Warner and Universal suggest a path forward through licensing agreements, but the ongoing disputes with Sony and Universal over Suno indicate that not all labels are willing to accept the same terms. The outcome of these cases could reshape how AI music companies source their training data and whether they can operate profitably under licensing arrangements rather than relying on fair use arguments.