“Searching for Cyclic TV Reference Paradoxes”, Jamie Pinheiro, 2022-06-05:

[code, interactive application] A while back I was watching Brooklyn Nine-Nine and came across this line: “I watch a lot of Grey’s Anatomy.” At first glance, there isn’t anything special about it. But after some more thought, I realized there was a deeper implication: in the fictional universe that Brooklyn Nine-Nine takes place in, Grey’s Anatomy exists as a TV show. More and more, I started noticing these sorts of fictional references across many TV shows. Eventually, this got me wondering what would happen if a bunch of references like these formed a cycle, for example if Grey’s Anatomy also referenced Brooklyn Nine-Nine. If that were to happen, each show would be relying on the other being fictional in its own universe, which cannot both be true simultaneously: a paradox!…I’ve decided to call them Cyclic TV Reference Paradoxes.

…The next best option was to look at the subtitles for TV shows. By searching the dialog for names of other TV shows, direct references could be found. This would miss indirect references (e.g., mentioning “Baby Yoda” to reference The Mandalorian), but those weren’t required for this project’s needs.

The Technicals: The first step was getting all the required subtitles. Easier said than done: subtitles were scattered across many different services and platforms. Luckily, I stumbled across Subliminal, a Python library that abstracts this complexity away and fetches subtitles from multiple subtitle providers. Pairing this with TMDB’s API, a great source of TV show metadata (names, popularity, episode/season counts), I managed to download over 40,000 subtitle files over a couple of days.

Great. However, this amounted to over 5 million lines of subtitles. To do anything meaningful with this amount of data, it needed to be indexed in some way. I ingested all the subtitle data into Whoosh, a Python text-search library that allows quick, fuzzy searches over the subtitles. From there, I could quickly search for the names of different TV shows to find references to them within other shows’ dialog.
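To make the index-then-search step concrete, here is a minimal sketch that uses a plain-Python inverted index as a stand-in for Whoosh. It is not the author's code and does not use the Whoosh API. The `subtitles` layout (show name mapped to a list of dialog lines) and the function names are assumptions made for illustration, and only exact title matches are handled, not fuzzy ones.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and keep runs of letters, digits, and straight apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(subtitles):
    """Map each token to the set of (show, line_number) pairs containing it.

    `subtitles` is a hypothetical dict: show name -> list of dialog lines.
    """
    index = defaultdict(set)
    for show, lines in subtitles.items():
        for i, line in enumerate(lines):
            for token in tokenize(line):
                index[token].add((show, i))
    return index


def find_references(subtitles, index, target):
    """Return the shows whose dialog contains `target` as an exact phrase."""
    tokens = tokenize(target)
    if not tokens:
        return set()
    # A matching line must contain every token of the title.
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    phrase = " " + " ".join(tokens) + " "
    sources = set()
    for show, i in candidates:
        line = " " + " ".join(tokenize(subtitles[show][i])) + " "
        # Require the tokens to be consecutive, and ignore self-references.
        if show != target and phrase in line:
            sources.add(show)
    return sources


subtitles = {
    "Brooklyn Nine-Nine": ["I watch a lot of Grey's Anatomy."],
    "Grey's Anatomy": ["Scalpel, please."],
}
index = build_index(subtitles)
print(find_references(subtitles, index, "Grey's Anatomy"))
```

Running `find_references` once per show title yields the edges of a reference graph (source show → referenced show). Short or generic titles would produce false positives under this exact-phrase approach, so a real pipeline would likely need extra filtering.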

…The web app runs a depth-first search to find any cycles, and was able to find a total of 72 Cyclic TV Reference Paradoxes!
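The excerpt does not show how the web app's depth-first search is written, so the following is only one possible sketch. It assumes the reference graph is a dict mapping each show to the set of shows it references, and `find_cycles` is a hypothetical name. To report each cycle once, it only accepts cycles that start at their alphabetically smallest show.

```python
def find_cycles(graph):
    """Enumerate every simple cycle in a directed graph exactly once.

    `graph` maps a show to the set of shows it references (assumed layout).
    Each cycle is returned as a list that starts and ends at the same show.
    """
    cycles = []

    def dfs(start, node, path, on_path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start:
                # Returned to the starting show: the path closes a cycle.
                cycles.append(path + [start])
            elif nxt > start and nxt not in on_path:
                # Only visit shows ordered after `start`, so each cycle is
                # found once, from its alphabetically smallest member.
                on_path.add(nxt)
                dfs(start, nxt, path + [nxt], on_path)
                on_path.remove(nxt)

    for start in sorted(graph):
        dfs(start, start, [start], {start})
    return cycles


graph = {
    "The O.C.": {"The Simpsons"},
    "The Simpsons": {"Two and a Half Men"},
    "Two and a Half Men": {"The Big Bang Theory"},
    "The Big Bang Theory": {"The O.C."},
}
for cycle in find_cycles(graph):
    print(" → ".join(cycle))
```

On the four-show cycle quoted in the article, this reports a single cycle, rotated to begin at The Big Bang Theory. Enumerating all simple cycles this way can take exponential time on dense graphs, which is tolerable at the scale of a few hundred shows but would not scale indefinitely.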

Some were small, just two TV shows referencing each other. Others were much larger and more intricate, made up of many TV shows (the largest had 12).

All in all, it was quite cool finding these cycles and proving the existence of these paradoxes. Here’s a set of screencaps illustrating another one of these paradoxes.

The O.C. → The Simpsons → Two and a Half Men → The Big Bang Theory → The O.C.

Other Findings: There were some other interesting findings this tool could illustrate well. The most referenced show was, unsurprisingly, Star Trek, which was referenced 239× across 45 shows. Interestingly, there weren’t any outward references from Star Trek, due to it being an older show. The show that referenced the most other shows was The Simpsons, referencing 44 shows a total of 127×. The Simpsons was actually a part of every cycle found, which makes sense given how frequently it both references other shows and is referenced itself.
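Figures of the form "referenced N× across M shows" can be derived from per-edge mention counts. The sketch below assumes a hypothetical `ref_counts` dict keyed by (source show, target show). The numbers in the example are invented toy values, not the article's data.

```python
from collections import defaultdict


def reference_stats(ref_counts):
    """Summarize inbound and outbound references per show.

    `ref_counts` maps (source_show, target_show) -> number of mentions.
    Returns two dicts, each mapping show -> (distinct shows, total mentions).
    """
    inbound = defaultdict(lambda: [0, 0])
    outbound = defaultdict(lambda: [0, 0])
    for (source, target), count in ref_counts.items():
        inbound[target][0] += 1        # one more show mentions `target`
        inbound[target][1] += count    # total mentions of `target`
        outbound[source][0] += 1       # one more show mentioned by `source`
        outbound[source][1] += count   # total mentions made by `source`
    return (
        {show: tuple(v) for show, v in inbound.items()},
        {show: tuple(v) for show, v in outbound.items()},
    )


# Toy counts for illustration only.
toy = {
    ("The Simpsons", "Star Trek"): 5,
    ("Family Guy", "Star Trek"): 3,
    ("The Simpsons", "Family Guy"): 2,
}
inbound, outbound = reference_stats(toy)
most_referenced = max(inbound, key=lambda show: inbound[show][1])
print(most_referenced, inbound[most_referenced])
```

A show that appears in `inbound` but not in `outbound` is referenced by others without referencing anything itself, which is the situation the article describes for Star Trek.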

Looking through more shows, a pattern emerged: more comedic shows (e.g., The Big Bang Theory, Family Guy…) referenced other shows more often. Conversely, more serious shows (e.g., Grey’s Anatomy, Game of Thrones…) were referenced frequently, yet never referenced other shows. It appears that relevancy, through the use of external references, helps shows come off as funny.