In a fiery Senate hearing on Wednesday, lawmakers, authors, and legal experts converged to scrutinize the AI industry’s controversial use of copyrighted works to train large language models. Titled “Too Big to Prosecute? Examining the AI Industry’s Mass Ingestion of Copyrighted Works for AI Training,” the session, hosted by the Senate Judiciary Subcommittee on Crime and Counterterrorism, spotlighted a brewing storm: whether tech giants have crossed ethical and legal lines in their race to dominate artificial intelligence.
At the center of the debate was the testimony of bestselling author David Baldacci, who voiced outrage over AI tools using pirated versions of his work to generate content in his voice—without consent or compensation. “You draw upon personal experiences, your own curiosities… a software platform really can’t replicate that,” Baldacci said. “And if it ever manages to do it, I’d like another planet to live on.”
Baldacci’s comments hit home for several senators, including subcommittee chair Josh Hawley, who noted that his own books had been scraped by AI models. Attorney Maxwell Pritt, representing authors in an ongoing lawsuit against Meta, confirmed that at least 12 books by subcommittee members had been found in AI training datasets.
According to Pritt, AI companies have used data from pirate sites to feed their models—then warned internally that public exposure could damage relations with lawmakers and regulators. “A Meta researcher explicitly noted that media coverage of pirated training data could jeopardize negotiations with the very people in this room,” Pritt revealed.
Baldacci stressed that he has always been open to licensing his work—but no one from Big Tech asked. “If someone comes to me and wants to license my work, I will listen to them,” he said. “We can negotiate something that’s agreeable to both parties.”
But it’s not so simple, said marketing and IT professor Michael Smith. “When you sign a license with a generative AI company, you’re signing with a gun to your head,” he said. “They can say, ‘Either take this offer or we’ll steal your work instead.’”
Not everyone agreed that the AI industry is overstepping. Law professor Edward Lee defended the use of copyrighted books for training AI, calling it transformative and citing national interests in staying ahead of China in the AI arms race. “It may benefit American corporations,” Lee argued.
Hawley wasn’t having it. “So the mass theft and potential impoverishment of American citizens benefits corporations—and that’s supposed to be good for America?” he fired back.
Professor Bhamati Viswanathan underscored the misuse of the fair use doctrine. “Fair use is an affirmative defense,” she said. “It means saying, ‘Yes, I infringed, but I did it for a good reason.’ Creating a trillion-dollar company doesn’t count.”
She added that AI companies using torrent networks aren’t just receiving pirated content—they’re redistributing it too, compounding the legal issues.
In his closing remarks, Hawley echoed Baldacci’s frustration: “If this isn’t infringement, Congress needs to do something… If we don’t act, a few mega corporations will vacuum up the genius of our people and pay no one for it. That’s not America.”
As public scrutiny intensifies and lawsuits mount, Wednesday’s hearing made one thing clear: the AI industry’s unchecked consumption of creative work may finally have met its reckoning on Capitol Hill.