
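Content provided by The New Stack Podcast and The New Stack. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The New Stack Podcast and The New Stack or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ko.player.fm/legal.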

Kubernetes GPU Management Just Got a Major Upgrade

35:26
 

Nvidia Distinguished Engineer Kevin Klues noted that low-level systems work is invisible when done well and highly visible when it fails — a dynamic that frames current Kubernetes innovations for AI. At KubeCon + CloudNativeCon North America 2025, Klues and AWS product manager Jesse Butler discussed two emerging capabilities: dynamic resource allocation (DRA) and a new workload abstraction designed for sophisticated AI scheduling.

DRA, now generally available in Kubernetes 1.34, fixes long-standing limitations in GPU requests. Instead of simply asking for a number of GPUs, users can specify types and configurations. Modeled after persistent volumes, DRA allows any specialized hardware to be exposed through standardized interfaces, enabling vendors to deliver custom device drivers cleanly. Butler called it one of the most elegant designs in Kubernetes.
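To make the "types and configurations" point concrete, here is a minimal sketch in Python that builds a DRA `ResourceClaim` manifest as a plain dictionary. It is an illustration under stated assumptions, not an example from the episode: the field layout follows the `resource.k8s.io/v1` API as best understood here and should be checked against the Kubernetes API reference, and the device class `gpu.example.com`, the attribute `model`, and the helper `build_gpu_claim` are hypothetical names. Real attribute names are defined by each vendor's DRA driver.

```python
import json


def build_gpu_claim(name, device_class, cel_expression, count=1):
    """Return a ResourceClaim manifest (as a dict) that asks for `count`
    devices of `device_class` matching a CEL selector expression.

    Assumption: field names mirror resource.k8s.io/v1; verify them against
    the API reference for your cluster version before applying.
    """
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaim",
        "metadata": {"name": name},
        "spec": {
            "devices": {
                "requests": [
                    {
                        "name": "gpu",
                        "exactly": {
                            "deviceClassName": device_class,
                            "allocationMode": "ExactCount",
                            "count": count,
                            # The selector filters on driver-published
                            # attributes rather than on a bare GPU count.
                            "selectors": [
                                {"cel": {"expression": cel_expression}}
                            ],
                        },
                    }
                ]
            }
        },
    }


# Hypothetical device class and attribute, for illustration only.
claim = build_gpu_claim(
    name="training-gpu",
    device_class="gpu.example.com",
    cel_expression='device.attributes["gpu.example.com"].model == "example-model"',
    count=2,
)
print(json.dumps(claim, indent=2))
```

The contrast with the older approach is that a plain count (for example, a resource limit of two GPUs) carries no information about which kind of device is acceptable, whereas the claim above states both the quantity and a selection rule.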

Yet complex AI workloads require more coordination. A forthcoming workload abstraction, debuting in Kubernetes 1.35, will let users define pod groups with strict scheduling and topology rules — ensuring multi-node jobs start fully or not at all. Klues emphasized that this abstraction will shape Kubernetes’ AI trajectory for the next decade and encouraged community involvement.
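The "start fully or not at all" behavior is commonly called gang scheduling. The following toy sketch in Python illustrates only the all-or-nothing idea; it is not the Kubernetes scheduler's algorithm or the new workload API, it ignores topology rules, and the names `PodGroup`, `min_count`, `gpus_per_pod`, and `try_admit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PodGroup:
    name: str
    min_count: int      # number of pods that must start together
    gpus_per_pod: int   # GPUs each pod needs on a single node


def try_admit(group, free_gpus_per_node):
    """Return one node name per pod if the whole group fits, else None.

    Works on a copy of the capacity map, so a failed attempt leaves the
    caller's view of the cluster untouched (nothing is partially placed).
    """
    free = dict(free_gpus_per_node)
    placement = []
    for _ in range(group.min_count):
        # First-fit over nodes in name order; a real scheduler would
        # score nodes and apply topology constraints here.
        node = next(
            (n for n, gpus in sorted(free.items()) if gpus >= group.gpus_per_pod),
            None,
        )
        if node is None:
            return None  # one pod cannot be placed, so the group is rejected
        free[node] -= group.gpus_per_pod
        placement.append(node)
    return placement


cluster = {"node-a": 8, "node-b": 8}
print(try_admit(PodGroup("train", min_count=4, gpus_per_pod=4), cluster))
print(try_admit(PodGroup("too-big", min_count=5, gpus_per_pod=4), cluster))
```

The first group fits (two pods per node) and gets a full placement; the second needs 20 GPUs against 16 available and is rejected as a whole rather than started with four of its five pods.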

Learn more from The New Stack about dynamic resource allocation:

Kubernetes Primer: Dynamic Resource Allocation (DRA) for GPU Workloads

Kubernetes v1.34 Introduces Benefits but Also New Blind Spots

