Best Reinforcement Learning Podcasts (2025)

Danijar Hafner on Dreamer v4 1:40:52

9d ago

Danijar Hafner was a Research Scientist at Google DeepMind until recently. Featured References Training Agents Inside of Scalable World Models [ blog ] Danijar Hafner, Wilson Yan, Timothy Lillicrap One Step Diffusion via Shortcut Models Kevin Frans, Danijar Hafner, Sergey Levine, Pieter Abbeel Action and Perception as Divergence Minimization [ blog…

David Abel on the Science of Agency @ RLDM 2025 59:42

2M ago

David Abel is a Senior Research Scientist at DeepMind on the Agency team, and an Honorary Fellow at the University of Edinburgh. His research blends computer science and philosophy, exploring foundational questions about reinforcement learning, definitions, and the nature of agency. Featured References Plasticity as the Mirror of Empowerment David …

Jake Beck, Alex Goldie, & Cornelius Braun on Sutton's OaK, Metalearning, LLMs, Squirrels @ RLC 2025 12:20

3M ago

Recorded at Reinforcement Learning Conference 2025 at University of Alberta, Edmonton Alberta Canada. Featured References Lecture on the Oak Architecture, Rich Sutton Alberta Plan, Rich Sutton with Mike Bowling and Patrick Pilarski Additional References Jacob Beck on Google Scholar Alex Goldie on Google Scholar Cornelius Braun on Google Scholar Rei…

Outstanding Paper Award Winners - 2/2 @ RLC 2025 14:18

3M ago

We caught up with the RLC Outstanding Paper award winners for your listening pleasure. Recorded on location at Reinforcement Learning Conference 2025, at University of Alberta, in Edmonton Alberta Canada in August 2025. Featured References Empirical Reinforcement Learning Research Mitigating Suboptimality of Deterministic Policy Gradients in Comple…

Outstanding Paper Award Winners - 1/2 @ RLC 2025 6:46

3M ago

We caught up with the RLC Outstanding Paper award winners for your listening pleasure. Recorded on location at Reinforcement Learning Conference 2025, at University of Alberta, in Edmonton Alberta Canada in August 2025. Featured References Scientific Understanding in Reinforcement Learning How Should We Meta-Learn Reinforcement Learning Algorithms?…

Thomas Akam on Model-based RL in the Brain 52:06

4M ago

Prof Thomas Akam is a Neuroscientist at the Oxford University Department of Experimental Psychology. He is a Wellcome Career Development Fellow and Associate Professor at the University of Oxford, and leads the Cognitive Circuits research group. Featured References Brain Architecture for Adaptive Behaviour Thomas Akam, RLDM 2025 Tutorial Additional…

Stefano Albrecht on Multi-Agent RL @ RLDM 2025 31:34

4M ago

Stefano V. Albrecht was previously Associate Professor at the University of Edinburgh, and is currently serving as Director of AI at startup Deepflow. He is a Program Chair of RLDM 2025 and is co-author of the MIT Press textbook "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches". Featured References Multi-Agent Reinforcement Le…

Satinder Singh: The Origin Story of RLDM @ RLDM 2025 5:57

5M ago

Professor Satinder Singh of Google DeepMind and U of Michigan is co-founder of RLDM. Here he narrates the origin story of the Reinforcement Learning and Decision Making meeting (not conference). Recorded on location at Trinity College Dublin, Ireland during RLDM 2025. Featured References RLDM 2025: Multi-disciplinary Conference on Reinforcement Lea…

NeurIPS 2024 - Posters and Hallways 3 10:01

8M ago

Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring Claire Bizon Monroc from Inria: WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control Andrew Wagenmaker from UC Berkeley: Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore f…

NeurIPS 2024 - Posters and Hallways 2 8:48

9M ago

Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring Jonathan Cook from University of Oxford: Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning Yifei Zhou from Berkeley AI Research: DigiRL: Training In-The-Wild Device-Control Agents wit…

NeurIPS 2024 - Posters and Hallways 1 9:32

9M ago

Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring Jiaheng Hu of University of Texas: Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning Skander Moalla of EPFL: No Representation, No Trust: Connecting Representation, Collapse, an…

Abhishek Naik on Continuing RL & Average Reward 1:21:40

9M ago

Abhishek Naik was a student at University of Alberta and Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton. Now he is a postdoctoral fellow at the National Research Council of Canada, where he does AI research on space applications. Featured References Reinforcement Learning for Cont…

NeurIPS 2024 RL meetup Hot takes: What sucks about RL? 17:45

11M ago

What do RL researchers complain about after hours at the bar? In this "Hot takes" episode, we find out! Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024. Special thanks to "David Beckham" for the inspiration :) By Robin Ranjit Singh Chauhan

RLC 2024 - Posters and Hallways 5 13:17

1y ago

Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 David Radke of the Chicago Blackhawks NHL on RL for professional sports 0:56 Abhishek Naik from the National Research Council on Continuing RL and Average Reward 2:42 Daphne Cornelisse from NYU on Autonomous Driving and Multi-…

RLC 2024 - Posters and Hallways 4 4:52

1y ago

Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 David Abel from DeepMind on 3 Dogmas of RL 0:55 Kevin Wang from Brown on learning variable depth search for MCTS 2:17 Ashwin Kumar from Washington University in St Louis on fairness in resource allocation 3:36 Prabhat Nagaraja…

RLC 2024 - Posters and Hallways 3 6:43

1y ago

Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Kris De Asis from Openmind on Time Discretization 2:23 Anna Hakhverdyan from U of Alberta on Online Hyperparameters 3:59 Dilip Arumugam from Princeton on Information Theory and Exploration 5:04 Micah Carroll from UC Berkeley o…

RLC 2024 - Posters and Hallways 2 15:52

1y ago

Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Hector Kohler from Centre Inria de l'Université de Lille with "Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning" 2:29 Quentin Delfosse from TU Darmstadt on "Interpretable Concept Bottlenecks to …

RLC 2024 - Posters and Hallways 1 5:46

1y ago

Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Ann Huang from Harvard on Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers 1:37 Jannis Blüml from TU Darmstadt on HackAtari: Atari Learning Environments for Robust and Continual Reinforceme…

Finale Doshi-Velez on RL for Healthcare @ RLC 2024 7:35

1y ago

Finale Doshi-Velez is a Professor at the Harvard Paulson School of Engineering and Applied Sciences. This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024. Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I…

David Silver 2 - Discussion after Keynote @ RLC 2024 16:17

1y ago

Thanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture. Recorded at UMass Amherst during RLC 2024. Due to the live recording environment, audio quality varies. We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion. References AlphaProof announcement on Deep…

David Silver @ RLC 2024 11:27

1y ago

David Silver is a principal research scientist at DeepMind and a professor at University College London. This interview was recorded at UMass Amherst during RLC 2024. References Discovering Reinforcement Learning Algorithms, Oh et al -- His keynote at RLC 2024 referred to a more recent update to this work, yet to be published Mastering Chess and Shog…

Vincent Moens on TorchRL 40:14

1+ y ago

Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict in pytorch. Featured References TorchRL: A data-driven decision-making library for PyTorch Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens Additional Refe…

Arash Ahmadian on Rethinking RLHF 33:30

1+ y ago

Arash Ahmadian is a Researcher at Cohere and Cohere For AI focussed on Preference Training of large language models. He’s also a researcher at the Vector Institute of AI. Featured Reference Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, J…

Glen Berseth on RL Conference 21:38

1+ y ago

Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI chair, member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL). Featured Links Reinforcement Learning Conference Closing the Gap between TD Learning and Supervised Learning…

Ian Osband 1:08:26

1+ y ago

Ian Osband is a Research scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty. We spoke about: - Information theory and RL - Exploration, epistemic uncertainty and joint predictions - Epistemic Neural Networks and scaling to LLMs Featured References Reinforcement Learning, Bit by Bit Xiuyuan Lu, Benjamin Van Roy,…

Sharath Chandra Raparthy 40:41

2y ago

Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more! Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila. Featured Reference Generalization to New Sequential Decision Making Tasks with In-Context Learning Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael …

Pierluca D'Oro and Martin Klissarov 57:24

2y ago

Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more! Pierluca D'Oro is PhD student at Mila and visiting researcher at Meta. Martin Klissarov is a PhD student at Mila and McGill and research scientist intern at Meta. Featured References Motif: Intrinsic Motivation from Artificial Intelligence F…

Martin Riedmiller 1:13:56

2y ago

Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more! Martin Riedmiller is a research scientist and team lead at DeepMind. Featured References Magnetic control of tokamak plasmas through deep r…

Max Schwarzer 1:10:18

2+ y ago

Max Schwarzer is a PhD student at Mila, with Aaron Courville and Marc Bellemare, interested in RL scaling, representation learning for RL, and RL for science. Max spent the last 1.5 years at Google Brain/DeepMind, and is now at Apple Machine Learning Research. Featured References Bigger, Better, Faster: Human-level Atari with human-level efficiency…

Julian Togelius 40:04

2+ y ago

Julian Togelius is an Associate Professor of Computer Science and Engineering at NYU, and Cofounder and research director at modl.ai Featured References Choose Your Weapon: Survival Strategies for Depressed AI Academics Julian Togelius, Georgios N. Yannakakis Learning Controllable 3D Level Generators Zehua Jiang, Sam Earle, Michael Cerny Green, Jul…

Jakob Foerster 1:03:45

2+ y ago

Jakob Foerster on Multi-Agent learning, Cooperation vs Competition, Emergent Communication, Zero-shot coordination, Opponent Shaping, agents for Hanabi and Prisoner's Dilemma, and more. Jakob Foerster is an Associate Professor at University of Oxford. Featured References Learning with Opponent-Learning Awareness Jakob N. Foerster, Richard Y. Chen, …

Danijar Hafner 2 45:15

2+ y ago

Danijar Hafner on the DreamerV3 agent and world models, the Director agent and hierarchical RL, realtime RL on robots with DayDreamer, and his framework for unsupervised agent design! Danijar Hafner is a PhD candidate at the University of Toronto with Jimmy Ba, a visiting student at UC Berkeley with Pieter Abbeel, and an intern at DeepMind. He has …

Jeff Clune 1:11:11

2+ y ago

AI Generating Algos, Learning to play Minecraft with Video PreTraining (VPT), Go-Explore for hard exploration, POET and Open Endedness, AI-GAs and ChatGPT, AGI predictions, and lots more! Professor Jeff Clune is Associate Professor of Computer Science at University of British Columbia, a Canada CIFAR AI Chair and Faculty Member at Vector Institute,…

Natasha Jaques 2 46:02

2+ y ago

Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more! Dr Natasha Jaques is a Senior Research Scientist at Google Brain. Featured References Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog Natasha Jaques, As…

Jacob Beck and Risto Vuorio 1:07:05

2+ y ago

Jacob Beck and Risto Vuorio on their recent Survey of Meta-Reinforcement Learning. Jacob and Risto are Ph.D. students at Whiteson Research Lab at University of Oxford. Featured Reference A Survey of Meta-Reinforcement Learning Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson Additional References…

John Schulman 44:21

3y ago

John Schulman is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI. Featured References WebGPT: Browser-assisted question-answering with human feedback Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, …

Sven Mika 34:56

3+ y ago

Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University. Featured References RLlib Documentation: RLlib: Industry-Grade Reinforcement Learning Ray: Documentation RLlib: Abstractions for Distributed Reinforc…

Karol Hausman and Fei Xia 1:03:09

3+ y ago

Karol Hausman is a Senior Research Scientist at Google Brain and an Adjunct Professor at Stanford working on robotics and machine learning. Karol is interested in enabling robots to acquire general-purpose skills with minimal supervision in real-world environments. Fei Xia is a Research Scientist with Google Research. Fei Xia is mostly interested i…

Sai Krishna Gottipati 1:08:11

3+ y ago

Sai Krishna Gottipati is an RL Researcher at AI Redefined, working on RL, MARL, and human-in-the-loop learning. Featured References Cogment: Open Source Framework For Distributed Multi-actor Training, Deployment & Operations AI Redefined, Sai Krishna Gottipati, Sagar Kurandwad, Clodéric Mars, Gregory Szriftgiser, François Chabot Do As You Teach: A Multi…

Aravind Srinivas 2 58:33

3+ y ago

Aravind Srinivas is back! He is now a Research Scientist at OpenAI. Featured References Decision Transformer: Reinforcement Learning via Sequence Modeling Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch VideoGPT: Video Generation using VQ-VAE and Transformers Wilson Y…

Rohin Shah 1:37:04

3+ y ago

Dr. Rohin Shah is a Research Scientist at DeepMind, and the editor and main contributor of the Alignment Newsletter. Featured References The MineRL BASALT Competition on Learning from Human Feedback Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, …

Robert Lange 1:10:57

4y ago

Robert Tjarko Lange is a PhD student working at the Technical University Berlin. Featured References Learning not to learn: Nature versus nurture in silico Lange, R. T., & Sprekeler, H. (2020) On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning Vischer, M. A., Lange, R. T., & Sprekeler, H. (2021). Semantic RL with Act…

NeurIPS 2021 Political Economy of Reinforcement Learning Systems (PERLS) Workshop 24:07

4y ago

We hear about the idea of PERLS and why it's important to talk about. Political Economy of Reinforcement Learning (PERLS) Workshop at NeurIPS 2021 on Tues Dec 14th NeurIPS 2021 By Robin Ranjit Singh Chauhan

Amy Zhang 1:09:35

4y ago

Amy Zhang is a postdoctoral scholar at UC Berkeley and a research scientist at Facebook AI Research. She will be starting as an assistant professor at UT Austin in Spring 2023. Featured References Invariant Causal Prediction for Block MDPs Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precu…

Xianyuan Zhan 41:30

4y ago

Xianyuan Zhan is currently a research assistant professor at the Institute for AI Industry Research (AIR), Tsinghua University. He received his Ph.D. degree at Purdue University. Before joining Tsinghua University, Dr. Zhan worked as a researcher at Microsoft Research Asia (MSRA) and a data scientist at JD Technology. At JD Technology, he led the r…

Eugene Vinitsky 1:06:02

4+ y ago

Eugene Vinitsky is a PhD student at UC Berkeley advised by Alexandre Bayen. He has interned at Tesla and Deepmind. Featured References A learning agent that acquires social norms from public sanctions in decentralized multi-agent settings Eugene Vinitsky, Raphael Köster, John P. Agapiou, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, Joel Z. Leib…

Jess Whittlestone 1:31:36

4+ y ago

Dr. Jess Whittlestone is a Senior Research Fellow at the Centre for the Study of Existential Risk and the Leverhulme Centre for the Future of Intelligence, both at the University of Cambridge. Featured References The Societal Implications of Deep Reinforcement Learning Jess Whittlestone, Kai Arulkumaran, Matthew Crosby Artificial Canaries: Early Wa…

Aleksandra Faust 54:30

4+ y ago

Dr Aleksandra Faust is a Staff Research Scientist and Reinforcement Learning research team co-founder at Google Brain Research. Featured References Reinforcement Learning and Planning for Preference Balancing Tasks Faust 2014 Learning Navigation Behaviors End-to-End with AutoRL Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis E…

Sam Ritter 1:40:35

4+ y ago

Sam Ritter is a Research Scientist on the neuroscience team at DeepMind. Featured References Unsupervised Predictive Memory in a Goal-Directed Agent (MERLIN) Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harle…

Thomas Krendl Gilbert 1:12:14

4+ y ago

Thomas Krendl Gilbert is a PhD student at UC Berkeley’s Center for Human-Compatible AI, specializing in Machine Ethics and Epistemology. Featured References Hard Choices in Artificial Intelligence: Addressing Normative Uncertainty through Sociotechnical Commitments Roel Dobbe, Thomas Krendl Gilbert, Yonatan Mintz Mapping the Political Economy of Re…
