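Content provided by Turpentine, Erik Torenberg, and Nathan Labenz. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Turpentine, Erik Torenberg, and Nathan Labenz or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined at https://ko.player.fm/legal.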

Embryology of AI: How Training Data Shapes AI Development w/ Timaeus' Jesse Hoogland & Daniel Murfet

1:36:54
 

Jesse Hoogland and Daniel Murfet, founders of Timaeus, introduce their mathematically rigorous approach to AI safety through "developmental interpretability" based on Singular Learning Theory. They explain how neural network loss landscapes are actually complex, jagged surfaces full of "singularities" where models can change internally without affecting external behavior—potentially masking dangerous misalignment. Using their Local Learning Coefficient measure, they've demonstrated the ability to identify critical phase changes during training in models up to 7 billion parameters, offering a complementary approach to mechanistic interpretability. This work aims to move beyond trial-and-error neural network training toward a more principled engineering discipline that could catch safety issues during training rather than after deployment.

Sponsors:

Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive

The AGNTCY: The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks. Join a community of engineers focused on high-quality multi-agent software and support the initiative at https://agntcy.org

NetSuite by Oracle: NetSuite by Oracle is the AI-powered business management suite trusted by over 41,000 businesses, offering a unified platform for accounting, financial management, inventory, and HR. Gain total visibility and control to make quick decisions and automate everyday tasks—download the free ebook, Navigating Global Trade: Three Insights for Leaders, at https://netsuite.com/cognitive

PRODUCED BY:

https://aipodcast.ing

CHAPTERS:

(00:00) About the Episode

(04:44) Introduction and Background

(06:17) Timaeus Origins and Philosophy

(09:13) Mathematical Background and SLT

(12:27) Developmental Interpretability Approach (Part 1)

(16:09) Sponsors: Oracle Cloud Infrastructure | The AGNTCY

(18:09) Developmental Interpretability Approach (Part 2)

(19:24) Proto-Paradigm and SAEs

(24:37) Understanding Generalization

(30:15) Central Dogma Framework (Part 1)

(32:13) Sponsor: NetSuite by Oracle

(33:37) Central Dogma Framework (Part 2)

(34:35) Loss Landscape Geometry

(40:41) Degeneracies and Evidence

(47:25) Structure and Data Connection

(55:36) Essential Dynamics and Algorithms

(01:00:53) Implicit Regularization and Complexity

(01:07:19) Double Descent and Scaling

(01:09:55) Big Picture Applications

(01:17:17) Reward Hacking and Risks

(01:25:19) Future Training Vision

(01:32:01) Scaling and Next Steps

(01:36:43) Outro

280 episodes

