PDF to podcast: a new way to catch up on long-procrastinated reading
Using NotebookLM "Audio overviews" to learn about transformers
I wake up on a Sunday morning and I want to catch up on a few scientific papers. (Hey, don't criticise my solitary weekend activities!)
A few years ago, there was basically one way to do it. You open the PDF and you start reading. You hit the first sentence that you don't understand. Then another. You try googling for an explanation. Your eyes glance over the text, but you struggle to pay attention. You give up and go back to scrolling Instagram.
But that's how it used to be. Now the year is 2024 and we live in a science fiction movie.
AI-generated podcasts
Today we can upload our PDFs to NotebookLM. We can add any additional sources we like: papers, notes in Google Docs, links to websites, tweets.
The tool crunches through the sources and can answer questions about them, summarize the content, etc. (As we saw in an earlier post.) So far, so good!
But starting this month, NotebookLM can do one more thing. It can synthesize a podcast based on the uploaded materials! It's called "Audio overviews" and all it takes is one click of a button.
Letâs give it a try!
Catching up on transformers
Transformers are the famous neural network architecture powering the LLMs of today: ChatGPT, Gemini, etc. To learn more about this topic, I uploaded three sources to NotebookLM:
- Attention is all you need [2017] - the paper that introduced transformers
- Annotated transformer - an online post explaining the architecture
- Andrej Karpathy's insightful tweets
Now, letâs get our tailor-made audio overview:
Results
You can listen to the resulting podcast @ audio.com/pnote/audio/transformers:
When we hit "Generate", NotebookLM writes a script in which two virtual podcast hosts discuss the sources we uploaded, and then turns it into audio.
Host A: It seems like every other week there's some big AI breakthrough making headlines. Easy to get lost in all the hype, right?
Host B: It really is.
Host A: But today we're diving into a topic that actually does live up to the hype. We're talking transformers. (…)
The conversational style of the recording is eerily realistic. It's easy to forget that it's AI-generated. Some of the transitions are really neat, like this bit referencing Andrej Karpathy's observations on what made transformers great:
Now, when Karpathy was talking about them, he specifically said that transformers are expressive, which I get, and efficient, which I also get. But he also said optimizable. What did he mean by that?
The discussion of the attention mechanism is pretty accurate at a high level:
Host A: When you read something like a news article or something, okay, you're not really paying attention to every single word. Are you?
Host B: No, definitely not.
Host A: Your brain's picking out the important stuff, the key facts, the big reveals. And that's kind of what this attention mechanism is doing in transformers.
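To connect the metaphor to the math: in the paper, each token produces a query that is compared against every other token's key, and the resulting weights decide how much of each token's value gets mixed into the output. Here is a minimal NumPy sketch of that scaled dot-product attention; the function name and the toy shapes are my own, not taken from the sources:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (seq_len, d_k); V: (seq_len, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # how strongly each query matches every key
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability for the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: attention weights per token
    return weights @ V                               # weighted mix of the values

# Toy self-attention: 4 tokens, 8-dimensional representations,
# with queries, keys and values all derived from the same input.
x = np.random.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

In the real architecture the queries, keys and values are learned linear projections of the input (and there are multiple attention heads), but the weighted-mixing idea the hosts describe is exactly this.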
On the other hand, the explanation of the role of residual connections (a key innovation that enables large neural networks to be efficiently trainable) is hand-wavy at best:
Host A: We don't just try to understand some complex concept all at once, right? It's step by step. (…) And transformers actually utilize the similar concept called residual connections, so they can break down these complicated patterns into smaller, more manageable chunks.
In fact, residual connections are not so much about breaking learning into steps as about ensuring that what earlier layers learned isn't lost as the network gets deeper: each sublayer's input is added back to its output, giving gradients a direct path through the stack.
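A minimal sketch of that idea, with a toy ReLU feed-forward network of my own choosing standing in for the transformer's sublayers (the real architecture also applies layer normalization, omitted here):

```python
import numpy as np

def feed_forward(x, W1, W2):
    # Toy sublayer: a two-layer MLP with a ReLU
    return np.maximum(0.0, x @ W1) @ W2

def residual_block(x, W1, W2):
    # The "x +" is the residual connection: the input is carried forward unchanged,
    # so earlier representations (and gradients) survive even in a very deep stack.
    return x + feed_forward(x, W1, W2)

d_model = 8
x = np.random.randn(4, d_model)
W1 = np.random.randn(d_model, 16)
W2 = np.random.randn(16, d_model)
y = residual_block(x, W1, W2)
print(y.shape)  # (4, 8) - same shape as the input, so blocks can be stacked deeply
```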
Conclusion
Today I can wake up on a Sunday morning, upload some scientific writing to an online tool and get a bespoke podcast discussing the paper takeaways.
I can then listen to it while cooking or jogging or mopping the floor. What a way to learn!
The year is 2024 and we live in a science fiction movie.
Credits
Thanks to Aruna, whose message inspired this post!
Postcard from Paris
Morning view of the Montmartre hill and the Sacre Coeur.
Stay curious,
– Przemek