
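Content provided by Turpentine, Erik Torenberg, and Nathan Labenz. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Turpentine, Erik Torenberg, and Nathan Labenz or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://ko.player.fm/legal.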

Everything You Wanted to Know About LLM Post-Training, with Nathan Lambert of Allen Institute for AI

1:50:08
 

In this episode of The Cognitive Revolution, we dive deep into frontier post-training techniques for large language models with Nathan Lambert from the Allen Institute for AI. Nathan discusses the groundbreaking Tulu 3 release, which matches Meta's post-training performance using the Llama base model. We explore supervised fine-tuning, preference-based reinforcement learning, and the innovative technique of reinforcement learning from verifiable rewards. Nathan provides unprecedented insights into the practical aspects of model development, compute requirements, and data generation strategies. This technically rich conversation illuminates previously opaque aspects of LLM development, drawing on work achieved by a small team of 10-15 people. Join us for one of our most detailed and valuable discussions on state-of-the-art AI model development.

Check out Nathan Lambert's newsletter:

https://www.natolambert.com

https://www.interconnects.ai

Be notified early when Turpentine drops a new publication: https://www.turpentine.co/exclusiveaccess

SPONSORS:

Incogni: Take your personal data back with Incogni! Use code REVOLUTION at the link below and get 60% off an annual plan: https://incogni.com/revolution

Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution

Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive

Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive

80,000 Hours: 80,000 Hours offers free one-on-one career advising for Cognitive Revolution listeners aiming to tackle global challenges, especially in AI. They connect high-potential individuals with experts, opportunities, and personalized career plans to maximize positive impact. Apply for a free call at https://80000hours.org/cognitiverevolution to accelerate your career and contribute to solving pressing AI-related issues.

RECOMMENDED PODCAST:

Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.

Apple: https://podcasts.apple.com/us/podcast/id1765716600

Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg

CHAPTERS:

(00:00:00) Teaser

(00:00:59) Sponsors: Incogni

(00:02:20) About the Episode

(00:05:56) Introducing AI2

(00:09:56) Tulu: Deep Dive (Part 1)

(00:17:43) Sponsors: Shopify | Oracle Cloud Infrastructure (OCI)

(00:20:38) Open vs. Closed Recipes

(00:29:48) Compute & Value (Part 1)

(00:34:22) Sponsors: 80,000 Hours | Notion

(00:37:02) Compute & Value (Part 2)

(00:42:41) Model Weight Evolution

(00:53:16) DPO vs. PPO

(01:06:36) Project Trajectory

(01:20:39) Synthetic Data & LLM Judge

(01:27:39) Verifiable RL

(01:38:17) Advice for Practitioners

(01:44:01) Open Source vs. Closed

(01:49:18) Outro
