
Content provided by Dev and Doc. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Dev and Doc or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ko.player.fm/legal.

#14 Aligning AI models for healthcare | Understanding Reinforcement Learning from Human Feedback (RLHF)

42:01
Manage episode 428686715 series 3585389

How do we align AI models for healthcare? 👨‍⚕️ And, importantly, how does an LLM handle the moral codes and ethics that we practise every day, in ethical scenarios like the trolley problem? This is a fascinating topic, and one we spend a lot of time thinking about. In this episode of Dev and Doc, Zeljko Kraljevic and I cover the latest topics in reinforcement learning, its benefits, and where it can go wrong. We also discuss different RL methods, including the algorithm used to train ChatGPT (RLHF).

Dev and Doc is a podcast where developers and doctors join forces to deep dive into AI in healthcare. Together, we can build models that matter.

👨🏻‍⚕️ Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua...
🤖 Dev - Zeljko Kraljevic - https://twitter.com/zeljkokr

The podcast 🎙️
🔊 Spotify: https://open.spotify.com/show/3QO5Lr3...
📙 Substack: https://aiforhealthcare.substack.com/

Hey! If you are enjoying our conversations, reach out and share your thoughts and journey with us. Don't forget to subscribe while you're here :)

🎞️ Editor - Dragan Kraljević - https://www.instagram.com/dragan_kral...
🎨 Brand design and art direction - Ana Grigorovici - https://www.behance.net/anagrigorovic...

Timestamps:
00:00 Highlights
01:27 Start
04:38 Aligning the ethics of AI models
07:04 Doctors' daily ethical choices
08:00 RLHF and AI training methods
16:29 Reinforcement learning
19:35 Preference model: rewarding models correctly can make or break success
27:05 Exploiting the reward function, model degradation (and how to fix it)

References:
AI intro paper - https://pn.bmj.com/content/23/6/476
OpenAI RLHF paper - https://arxiv.org/abs/1909.08593
War and Peace of LLMs! - https://arxiv.org/abs/2311.17227


30 episodes

