PDFs defeat AI?

I just heard a teaser for a story on how PDFs have become ubiquitous, with the supposed downside that AIs have a lot of trouble reading them. The implication was that this was bad, but I thought “Awesome! I’m going to have to switch to PDFs for more of my output! Oh, and I think I’ll start using TeX to produce more of that output!”

If you’ve ever read the contents of a PDF file produced by TeX you’ll understand.

[Image: Block of unreadable text at the beginning of a PDF file produced by TeX]
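For anyone who hasn't peeked inside one: a likely reason for the gibberish is that PDF streams are usually stored compressed (the FlateDecode filter, which is zlib/deflate). Here is a minimal Python sketch of the effect. The content stream is a made-up stand-in, not bytes taken from a real TeX-produced file, and `looks_readable` is just a helper invented for the illustration.

```python
import zlib

# Hypothetical stand-in for a PDF content stream: text operators that
# would draw one short line. Real TeX output is far longer.
content = b"BT /F1 12 Tf 72 720 Td (Hello from TeX) Tj ET"

# FlateDecode streams are zlib/deflate data, so zlib.compress gives
# roughly what ends up stored in the file.
compressed = zlib.compress(content)


def looks_readable(data: bytes) -> bool:
    """Return True if every byte is printable ASCII or common whitespace."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in data)


print(looks_readable(content))      # the operators themselves are plain text
print(looks_readable(compressed))   # the stored bytes are not
print(zlib.decompress(compressed) == content)  # nothing is lost, only hidden
```

The compression is fully reversible, of course, so this is a speed bump for a human with a text editor rather than a real barrier for software.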

Update: Turns out it was a story in the Economist. Here’s a gift link to the story (should get the first few people who click on it past the paywall):

https://www.economist.com/business/2026/02/24/the-war-against-pdfs-is-heating-up?giftId=OTNkOGVmNTgtN2ZmMi00NjAzLWExMmQtMDg0NjU5YzM1ZTY2&utm_campaign=gifted_article

And here’s the money quote:

The large language models underpinning generative AI are often bamboozled by PDFs, reading a page set in columns from left to right rather than top to bottom, say, or getting confused by headers and footers. Trouble parsing PDFs is one of the reasons AI chatbots occasionally “hallucinate”, generating nonsense.

I mean, for values of “money” that are totally confused about why LLMs hallucinate.

Philip Brewer

Writer: science fiction and fantasy, personal finance, and Esperanto
