
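Content provided by Machine Learning Street Talk (MLST). All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Machine Learning Street Talk (MLST) or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ko.player.fm/legal.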

Prof. Jakob Foerster - ImageNet Moment for Reinforcement Learning?

53:31
 

Prof. Jakob Foerster, a leading AI researcher at Oxford University and Meta, and Chris Lu, a researcher at OpenAI, explain how AI is moving beyond mimicking human behaviour towards truly intelligent agents that can learn and solve problems on their own. Foerster champions open-source AI for responsible, decentralised development. He addresses AI scaling, goal misalignment (Goodhart's Law), and the need for holistic alignment, offering a quick look at the future of AI and how to guide it.

SPONSOR MESSAGES:

***

CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!

https://centml.ai/pricing/

Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich.

Go to https://tufalabs.ai/

***

TRANSCRIPT/REFS:

https://www.dropbox.com/scl/fi/yqjszhntfr00bhjh6t565/JAKOB.pdf?rlkey=scvny4bnwj8th42fjv8zsfu2y&dl=0

Prof. Jakob Foerster:

https://x.com/j_foerst

https://www.jakobfoerster.com/

University of Oxford Profile:

https://eng.ox.ac.uk/people/jakob-foerster/

Chris Lu:

https://chrislu.page/

TOC

1. GPU Acceleration and Training Infrastructure

[00:00:00] 1.1 ARC Challenge Criticism and FLAIR Lab Overview

[00:01:25] 1.2 GPU Acceleration and Hardware Lottery in RL

[00:05:50] 1.3 Data Wall Challenges and Simulation-Based Solutions

[00:08:40] 1.4 JAX Implementation and Technical Acceleration

2. Learning Frameworks and Policy Optimization

[00:14:18] 2.1 Evolution of RL Algorithms and Mirror Learning Framework

[00:15:25] 2.2 Meta-Learning and Policy Optimization Algorithms

[00:21:47] 2.3 Language Models and Benchmark Challenges

[00:28:15] 2.4 Creativity and Meta-Learning in AI Systems

3. Multi-Agent Systems and Decentralization

[00:31:24] 3.1 Multi-Agent Systems and Emergent Intelligence

[00:38:35] 3.2 Swarm Intelligence vs Monolithic AGI Systems

[00:42:44] 3.3 Democratic Control and Decentralization of AI Development

[00:46:14] 3.4 Open Source AI and Alignment Challenges

[00:49:31] 3.5 Collaborative Models for AI Development

REFS

[00:00:05] ARC Benchmark, Chollet

https://github.com/fchollet/ARC-AGI

[00:03:05] DRL Doesn't Work, Irpan

https://www.alexirpan.com/2018/02/14/rl-hard.html

[00:05:55] AI Training Data, Data Provenance Initiative

https://www.nytimes.com/2024/07/19/technology/ai-data-restrictions.html

[00:06:10] JaxMARL, Foerster et al.

https://arxiv.org/html/2311.10090v5

[00:08:50] M-FOS, Lu et al.

https://arxiv.org/abs/2205.01447

[00:09:45] JAX Library, Google Research

https://github.com/jax-ml/jax

[00:12:10] Kinetix, Mike and Michael

https://arxiv.org/abs/2410.23208

[00:12:45] Genie 2, DeepMind

https://deepmind.google/discover/blog/genie-2-a-large-scale-foundation-world-model/

[00:14:42] Mirror Learning, Grudzien, Kuba et al.

https://arxiv.org/abs/2208.01682

[00:16:30] Discovered Policy Optimisation, Lu et al.

https://arxiv.org/abs/2210.05639

[00:24:10] Goodhart's Law, Goodhart

https://en.wikipedia.org/wiki/Goodhart%27s_law

[00:25:15] LLM ARChitect, Franzen et al.

https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf

[00:28:55] AlphaGo, Silver et al.

https://arxiv.org/pdf/1712.01815.pdf

[00:30:10] Meta-learning, Lu, Towers, Foerster

https://direct.mit.edu/isal/proceedings-pdf/isal2023/35/67/2354943/isal_a_00674.pdf

[00:31:30] Emergence of Pragmatics, Yuan et al.

https://arxiv.org/abs/2001.07752

[00:34:30] AI Safety, Amodei et al.

https://arxiv.org/abs/1606.06565

[00:35:45] Intentional Stance, Dennett

https://plato.stanford.edu/entries/ethics-ai/

[00:39:25] Multi-Agent RL, Zhou et al.

https://arxiv.org/pdf/2305.10091

[00:41:00] Open Source Generative AI, Foerster et al.

https://arxiv.org/abs/2405.08597


