Information theory and the complexities of AI model monitoring

Content provided by Dr. Andrew Clark & Sid Mangalik. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Dr. Andrew Clark & Sid Mangalik or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ko.player.fm/legal.

In this episode, we explore information theory and the not-so-obvious shortcomings of its popular metrics for model monitoring, and where non-parametric statistical methods can serve as the better option.

Introduction and latest news 0:03

Information theory and its applications in AI. 3:45

  • The importance of information theory in computer science, citing its applications in cryptography and communication.
  • The basics of information theory, including the concept of entropy, which measures the uncertainty of a random variable (a short sketch follows this list).
  • Information theory as a fundamental discipline in computer science, and how it has been applied in recent years, particularly in the field of machine learning.
  • The speakers clarify the difference between a metric and a divergence, which is crucial to understanding how information theory is being misapplied in some cases.
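
For a concrete picture of entropy, here is a minimal sketch in Python; it is not code from the episode, and the coin probabilities are made up for illustration.

  import math

  def entropy(probs, base=2):
      # Shannon entropy H(X) = -sum(p * log p) of a discrete distribution.
      return -sum(p * math.log(p, base) for p in probs if p > 0)

  # A fair coin is maximally uncertain (1 bit); a biased coin carries less information.
  print(entropy([0.5, 0.5]))   # 1.0
  print(entropy([0.9, 0.1]))   # ~0.469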

Information theory metrics and their limitations. 7:05

  • Divergences are measurements that do not follow all the rules of a distance metric, such as symmetry and the triangle inequality; they have some nice properties but can be troublesome in certain use cases.
  • KL divergence is a popular measure for monitoring changes in data distributions, but it is not symmetric and can lead to incorrect comparisons (see the sketch after this list).
  • Sid explains that KL divergence measures the extra surprisal, or relative entropy, incurred in moving from one data distribution to another, and is not the same as the KS test.
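
The following sketch, again illustrative rather than from the episode, shows the asymmetry in practice: D_KL(P||Q) and D_KL(Q||P) disagree for the same pair of binned distributions. The bin proportions are invented.

  import numpy as np

  def kl_divergence(p, q):
      # D_KL(P || Q) = sum p * log(p / q), over bins where p > 0; assumes q > 0 on those bins.
      p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
      mask = p > 0
      return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

  p = [0.1, 0.4, 0.5]   # e.g., binned training data
  q = [0.3, 0.3, 0.4]   # e.g., binned production data
  print(kl_divergence(p, q))  # ~0.117
  print(kl_divergence(q, p))  # ~0.154 -- a different number, hence "not symmetric"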

Metrics for monitoring AI model changes. 10:41

  • The limitations of KL divergence and its alternatives, including Jensen-Shannon divergence and the population stability index.
  • They highlight the issues with KL divergence, such as asymmetry and the handling of zeros, and the advantages of Jensen-Shannon divergence, which addresses both issues, and the population stability index, which provides a quantitative measure of changes in model distributions (both are sketched after this list).
  • The popularity of information theory metrics in AI and ML is largely due to legacy and a lack of understanding of the underlying concepts.
  • Information theory metrics may not be the best choice for quantifying change in risk in the AI and ML space, but they are the ones that are commonly used due to familiarity and ease of use.
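
Here is a minimal sketch of the two alternatives named above: Jensen-Shannon divergence, which symmetrizes KL and stays finite when a bin is empty, and one common form of the population stability index. The bin proportions and the eps smoothing constant are illustrative choices, not values from the episode.

  import numpy as np

  def kl(p, q):
      # D_KL(P || Q) over bins where p > 0.
      mask = p > 0
      return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

  def js_divergence(p, q):
      # Jensen-Shannon divergence: average KL against the midpoint distribution; symmetric.
      p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
      m = 0.5 * (p + q)
      return 0.5 * kl(p, m) + 0.5 * kl(q, m)

  def psi(expected, actual, eps=1e-6):
      # Population stability index: sum((a - e) * ln(a / e)), with eps smoothing for empty bins.
      e = np.asarray(expected, dtype=float) + eps
      a = np.asarray(actual, dtype=float) + eps
      return float(np.sum((a - e) * np.log(a / e)))

  train = [0.25, 0.50, 0.25, 0.00]   # a reference bin can be empty
  prod  = [0.20, 0.45, 0.30, 0.05]
  print(js_divergence(train, prod))  # ~0.02, and symmetric in its arguments
  print(psi(train, prod))            # ~0.57; the empty reference bin dominates the score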

Using nonparametric statistics in modeling systems. 15:09

  • Information theory divergences are not useful for monitoring production model performance, according to the speakers.
  • Andrew Clark highlights the advantages of using nonparametric statistics in machine learning, including distribution agnosticism and the ability to test for significance without knowing the underlying distribution.
  • Sid Mangalik and Andrew Clark recommend using nonparametric tests such as the KS test and the chi-square test to supplement divergences (a usage sketch follows this list).
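
As a hedged illustration of how those tests might be wired into a drift check with scipy: the samples below are synthetic, and the 0.05 significance level is just a conventional choice, not a recommendation from the episode.

  import numpy as np
  from scipy import stats

  rng = np.random.default_rng(0)

  # Continuous feature: two-sample Kolmogorov-Smirnov test, with no distributional assumptions.
  reference = rng.normal(loc=0.0, scale=1.0, size=5000)    # e.g., training-time feature values
  production = rng.normal(loc=0.3, scale=1.0, size=5000)   # e.g., live feature values (shifted)
  ks_stat, ks_p = stats.ks_2samp(reference, production)

  # Categorical feature: chi-square test on observed category counts from the two periods.
  ref_counts = [400, 350, 250]     # category counts at training time
  prod_counts = [300, 380, 320]    # category counts in production
  chi2, chi_p, dof, _ = stats.chi2_contingency([ref_counts, prod_counts])

  alpha = 0.05  # conventional significance level
  print("Continuous drift detected:", ks_p < alpha)
  print("Categorical drift detected:", chi_p < alpha)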

Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

  • LinkedIn - Episode summaries, shares of cited articles, and more.
  • YouTube - Was it something that we said? Good. Share your favorite quotes.
  • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.