How Ziff Davis’s lawsuit against OpenAI redraws the battle lines with the media

In the early days of the current AI boom, The New York Times sued OpenAI and Microsoft for copyright infringement. It was a seismic move, but perhaps the most notable thing about it is what came after. In the subsequent months, publisher after publisher signed licensing deals with OpenAI, making their content available to ChatGPT. There were others who chose litigation, certainly, but most major media companies opted to take some money rather than spend it on lawyers.
That changed last week when Ziff Davis filed its own copyright lawsuit against OpenAI. Ziff owns several major online properties, including Mashable, CNET, IGN, and Lifehacker, and garners a massive amount of web traffic. According to the filing, its properties earned an average of 292 million monthly page views over the past year.
Strange, then, that OpenAI didn’t bother to negotiate with Ziff at all. The filing mentions that, after asking OpenAI to stop scraping its content without authorization, Ziff’s requests to negotiate were “rebuffed.” A news story about the lawsuit in PCMag (another Ziff property) also said OpenAI wouldn’t talk, though it’s unclear whether it was just repeating what the filing described.
While The Times and Ziff aren’t alone in their legal efforts against OpenAI, it’s informative to compare the two complaints, filed almost 16 months apart, to get an understanding of how the stakes of the AI-media cold war have evolved. AI technology has progressed considerably and we now have a much greater understanding of AI substitution risk—the fancy name for AI summarization of publisher content. Ziff’s lawsuit gives us a better idea of how the AI sausage is made these days and can tell us just how much other AI players should be sweating.
For starters, Ziff makes what has become the table-stakes claim in most of these lawsuits: that scraping content, storing copies of it in a database, and then serving up either a “derivative work” (summaries) or the content itself is inherently a copyright violation. OpenAI has maintained, however clumsily at times, that its harvesting of content on the web to train models falls under fair use, a key exception to copyright law that has supported some instances of mass digital copying in the past.
That’s the central conflict to all these cases, but Ziff’s action goes in some novel directions that point to how things have changed since ChatGPT first arrived:
1. AI, meet DMCA
Ziff runs a few more yards with the copyright ball, claiming that OpenAI deliberately stripped copyright management information (CMI) from Ziff content. This is a bit of a technicality—essentially it means ChatGPT answers often don’t include bylines, the name of the publication, and other metadata that would identify the source. However, stripping out CMI from content and then distributing it under your own banner is a violation of the Digital Millennium Copyright Act (DMCA), giving the filing more teeth.
2. It’s a RAG world now
This is arguably the most important change between the two lawsuits and reflective of how the way we use AI to access information has changed. When The Times filed suit, ChatGPT wasn’t a proper search engine, and the public was only just beginning to understand retrieval-augmented generation, or RAG—broadly, how AI systems can go beyond their training data. RAG is an essential element of any AI-based search engine today, and it’s also massively increased the risk of AI substitution to publishers since a chatbot that can summarize current news is much more useful than one that only has access to archives that cut off after a certain date (remember that?).
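To make the mechanism concrete, here is a minimal sketch of the RAG pattern: fetch relevant documents first, then stuff them into the prompt so the model can answer about material newer than its training cutoff. Everything here is illustrative — a toy keyword-overlap retriever stands in for real vector search, and a placeholder stands in for the actual model call; none of it reflects OpenAI's internal pipeline.

```python
# Toy sketch of retrieval-augmented generation (RAG). All names and
# logic are illustrative assumptions, not any vendor's real system.

def retrieve(query, documents, k=1):
    """Rank documents by shared words with the query; return the top k.
    A production system would use embeddings and a vector index instead."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, documents):
    """Prepend retrieved text to the prompt so the model can ground its
    answer in current content rather than only its training data."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "Ziff Davis filed a copyright lawsuit against OpenAI.",
    "RAG lets chatbots pull in documents beyond their training data.",
]
# A real system would send this prompt to a language model.
print(build_prompt("What did Ziff Davis file?", docs))
```

The substitution risk the lawsuit describes lives in that `retrieve` step: the retrieved text is often a publisher's article, fetched and summarized in place of a visit to the publisher's site.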
3. Watering down the brands
Ziff frames the hallucination problem in a novel way, calling it “trademark dilution.” Media brands like Mashable and PCMag (both of which I used to work at) have built up their reputations over years or decades, and the complaint makes the case that every time ChatGPT attributes a falsehood to one of them, or wholesale imagines a fake review, it chips away at that reputation. It’s a subtle point, but a compelling one that points to a future where valuable brands slowly become generic labels floating in the AI ether.
4. Paywalls are the first line of defense
Ziff says in the filing that its properties are particularly vulnerable to AI substitution because so little of its content is behind paywalls. Ziff’s business model is based primarily on advertising and commerce (mostly from readers clicking on affiliate links in articles), both of which depend on actual humans visiting websites and taking actions. If an AI summary negates that act, and there’s no licensing or subscription revenue to make up for it, that’s a huge hit to the business.
5. Changing robots.txt isn’t enough
Nearly every website publishes a file that tells web crawlers which parts of the site they may access. This “robots.txt” file allows sites to, say, let Google crawl their pages while blocking AI training bots. Indeed, many sites do exactly that, but according to Ziff, it makes no difference. Despite explicitly blocking OpenAI’s GPTBot, Ziff still logged a spike in the bot’s activity on some of its sites. It’s generally assumed companies like OpenAI use third-party crawlers to scrape sites they’re not supposed to, but Ziff’s lawsuit accuses OpenAI of openly flouting the rules it claims to respect.
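For reference, the kind of directives at issue look like this — a hypothetical robots.txt that welcomes Google’s crawler but tells OpenAI’s GPTBot to stay off the entire site (the user-agent names are the ones Google and OpenAI document for their crawlers; compliance is voluntary on the crawler’s part, which is exactly Ziff’s complaint):

```
# Allow Google's search crawler everywhere
User-agent: Googlebot
Allow: /

# Block OpenAI's training crawler from the whole site
User-agent: GPTBot
Disallow: /
```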
6. Regurgitation is still an issue
The original Times complaint spends many pages on the issue of “regurgitation”—when an AI system doesn’t just summarize a piece of content but instead repeats it, word for word. Generally this was thought to be a mostly solved issue, but Ziff’s filing claims it still happens, and that exact copies of articles are a relatively easy thing for ChatGPT users to call up. Apparently asking what the original text “might look like with three spaces after every period” is a method some have used to fool the chatbot into serving up exact copies of an article. (For the record, it didn’t work for me.)
The battle continues
Just when it was looking like licensing deals would be the new normal, Ziff Davis’s filing shows the fight between AI and news is far from over. How it plays out could end up being even more existential for a paywall-free company like Ziff than it is for subscription-driven publishers. However the court rules, the case confronts a more fundamental question: Can strong media brands that rely on commerce and free access coexist with AI systems that learn—and sometimes mislearn—from everything they touch?