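Content provided by Machine Learning Street Talk (MLST). All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Machine Learning Street Talk (MLST) or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://ko.player.fm/legal.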

Sepp Hochreiter - LSTM: The Comeback Story?

1:07:01
 

Sepp Hochreiter is the inventor of LSTM (Long Short-Term Memory) networks, a foundational technology in AI. He discusses his journey, the origins of LSTM, and why he believes his latest work, xLSTM, could be the next big thing in AI, particularly for applications like robotics and industrial simulation. He also shares his controversial perspective on Large Language Models (LLMs) and explains why he sees reasoning as a critical missing piece in current AI systems.

SPONSOR MESSAGES:

***

CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!

https://centml.ai/pricing/

Tufa AI Labs is a brand-new research lab in Zurich, started by Benjamin Crouzier and focussed on o-series-style reasoning and AGI. They are hiring a Chief Engineer and ML engineers, and they hold events in Zurich.

Go to https://tufalabs.ai/

***

TRANSCRIPT AND BACKGROUND READING:

https://www.dropbox.com/scl/fi/n1vzm79t3uuss8xyinxzo/SEPPH.pdf?rlkey=fp7gwaopjk17uyvgjxekxrh5v&dl=0

Prof. Sepp Hochreiter

https://www.nx-ai.com/

https://x.com/hochreitersepp

https://scholar.google.at/citations?user=tvUH3WMAAAAJ&hl=en

TOC:

1. LLM Evolution and Reasoning Capabilities

[00:00:00] 1.1 LLM Capabilities and Limitations Debate

[00:03:16] 1.2 Program Generation and Reasoning in AI Systems

[00:06:30] 1.3 Human vs AI Reasoning Comparison

[00:09:59] 1.4 New Research Initiatives and Hybrid Approaches

2. LSTM Technical Architecture

[00:13:18] 2.1 LSTM Development History and Technical Background

[00:20:38] 2.2 LSTM vs RNN Architecture and Computational Complexity

[00:25:10] 2.3 xLSTM Architecture and Flash Attention Comparison

[00:30:51] 2.4 Evolution of Gating Mechanisms from Sigmoid to Exponential

3. Industrial Applications and Neuro-Symbolic AI

[00:40:35] 3.1 Industrial Applications and Fixed Memory Advantages

[00:42:31] 3.2 Neuro-Symbolic Integration and Pi AI Project

[00:46:00] 3.3 Integration of Symbolic and Neural AI Approaches

[00:51:29] 3.4 Evolution of AI Paradigms and System Thinking

[00:54:55] 3.5 AI Reasoning and Human Intelligence Comparison

[00:58:12] 3.6 NXAI Company and Industrial AI Applications

REFS:

[00:00:15] Seminal LSTM paper establishing Hochreiter's expertise (Hochreiter & Schmidhuber)

https://direct.mit.edu/neco/article-abstract/9/8/1735/6109/Long-Short-Term-Memory

[00:04:20] Kolmogorov complexity and program composition limitations (Kolmogorov)

https://link.springer.com/article/10.1007/BF02478259

[00:07:10] Limitations of LLM mathematical reasoning and symbolic integration (Various Authors)

https://www.arxiv.org/pdf/2502.03671

[00:09:05] AlphaGo’s Move 37 demonstrating creative AI (Google DeepMind)

https://deepmind.google/research/breakthroughs/alphago/

[00:10:15] New AI research lab in Zurich for fundamental LLM research (Benjamin Crouzier)

https://tufalabs.ai

[00:19:40] Introduction of xLSTM with exponential gating (Beck, Hochreiter, et al.)

https://arxiv.org/abs/2405.04517

[00:22:55] FlashAttention: fast & memory-efficient attention (Tri Dao et al.)

https://arxiv.org/abs/2205.14135

[00:31:00] Historical use of sigmoid/tanh activation in 1990s (James A. McCaffrey)

https://visualstudiomagazine.com/articles/2015/06/01/alternative-activation-functions.aspx

[00:36:10] Mamba 2 state space model architecture (Albert Gu et al.)

https://arxiv.org/abs/2312.00752

[00:46:00] Austria’s Pi AI project integrating symbolic & neural AI (Hochreiter et al.)

https://www.jku.at/en/institute-of-machine-learning/research/projects/

[00:48:10] Neuro-symbolic integration challenges in language models (Diego Calanzone et al.)

https://openreview.net/forum?id=7PGluppo4k

[00:49:30] JKU Linz’s historical and neuro-symbolic research (Sepp Hochreiter)

https://www.jku.at/en/news-events/news/detail/news/bilaterale-ki-projekt-unter-leitung-der-jku-erhaelt-fwf-cluster-of-excellence/

YT: https://www.youtube.com/watch?v=8u2pW2zZLCs
