Anton Teaches Packy AI | Ep 2 | Chinchilla

"Age of Miracles"

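Content provided by Packy McCormick and Packy McCormick | Turpentine. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Packy McCormick and Packy McCormick | Turpentine or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ko.player.fm/legal.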

3y ago 1:02:45


We're back! In Episode 2, Anton Teaches Packy about DeepMind's March 2022 paper, Training Compute-Optimal Large Language Models, or as it's more commonly known, Chinchilla. Prior to Chinchilla, the best way to improve the performance of LLMs was thought to be scaling up the size of the model. As a result, the largest models now have over 500 billion parameters. But there are only so many GPUs in the world, and throwing compute at the problem is expensive and energy-intensive. In this paper, DeepMind found that the optimal way to scale an LLM is actually to scale model size (parameters) and training data (tokens) proportionally. Given the race for size, today's models are plenty big but need a lot more data.
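For a rough feel of what "proportionally" means, here is a minimal back-of-the-envelope sketch in Python. It is not code from the paper or the episode. It assumes the common approximation that training compute is about 6 FLOPs per parameter per token, and the rule of thumb of roughly 20 training tokens per parameter that is often quoted from Chinchilla (the 70-billion-parameter Chinchilla model was trained on about 1.4 trillion tokens). The function name `compute_optimal` and both constants are illustrative choices.

```python
import math

# Assumption: training compute C is approximately 6 * N * D FLOPs,
# where N is the parameter count and D is the number of training tokens.
FLOPS_PER_PARAM_TOKEN = 6.0

# Assumption: rough rule of thumb of about 20 tokens per parameter.
TOKENS_PER_PARAM = 20.0


def compute_optimal(compute_flops):
    """Split a FLOP budget into (parameters, tokens) under the assumptions above.

    With D = r * N and C = 6 * N * D, solving for N gives N = sqrt(C / (6 * r)).
    """
    n_params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    # Each step multiplies the budget by 4, which doubles both parameters and tokens.
    for budget in (1.2e20, 4.8e20, 1.92e21):
        n, d = compute_optimal(budget)
        print(f"C = {budget:.2e} FLOPs -> N = {n:.2e} params, D = {d:.2e} tokens")
```

Under these assumptions both quantities grow with the square root of the compute budget, so a model that doubles in size should also see about twice as much data.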

In this conversation, we go deep on the paper itself, but we also zoom out to talk about the politics of AI, when AGI is going to hit, where to get more data, and why AI won't take our jobs. This one gets a lot more philosophical than our first episode as we explore the implications of Chinchilla and LLMs more generally. If you enjoyed this conversation, subscribe for more. We're going to try to release one episode per week, and we want to make this the best way to get a deeper understanding of the mind-blowing progress happening in AI and what it means for everything we do as humans.

LINKS: Training Compute-Optimal Large Language Models: https://arxiv.org/abs/2203.15556

chinchilla's wild implications: https://www.lesswrong.com/posts/6Fpvc...

Scaling Laws for Neural Language Models (Kaplan et al): https://arxiv.org/abs/2001.08361

--- Send in a voice message: https://podcasters.spotify.com/pod/show/ageofmiracles/message

222 episodes