Player FM - Internet Radio Done Right
26 subscribers
Checked 2d ago
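Added three years ago
Content provided by Salim Virji. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Salim Virji or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://ko.player.fm/legal.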
Google SRE Prodcast
SRE Prodcast brings Google's experience with Site Reliability Engineering together with special guests and exciting topics to discuss the present and future of reliable production engineering!
42 episodes
All episodes

Google Staff SRE Ramón Llamas and Google Software Engineer Swapnil Haria join our hosts to explore how AI agents are revolutionizing production management, from summarizing alerts and finding hidden errors to proactively preventing outages. Learn about the challenges of evaluating non-deterministic systems and the fascinating interplay between human expertise and emerging AI capabilities in ensuring robust and reliable infrastructure.…

The One with Technical Program Managers and Karanveer Anand (27:48)
This episode features Google Technical Program Manager (TPM) Karanveer Anand, who joins our hosts to discuss the unique role of TPMs in Site Reliability Engineering (SRE). The conversation highlights how SRE TPMs bridge the gap between technical details and business impact, managing complex projects with inter-team dependencies and ensuring system reliability, particularly in the rapidly evolving AI landscape.…

This episode discusses Systems Theoretic Process Analysis (STPA), a method for analyzing complex systems. Theo Klein, a Google SRE, and Jeffrey Snover, a Distinguished Engineer at Google, explain that STPA focuses on identifying how system accidents and losses occur due to a loss of control, rather than component failures. STPA helps identify design flaws early, even before code is written! The discussion highlights that STPA is a human-driven process, prompting critical questions about system goals and potential losses, and that Google is adapting the pure STPA approach for commercial software development to make it more practical and efficient.…

In this episode, hosts Steve McGhee and Matt Siegler are joined by guest Adam Fletcher, CEO and Co-Founder of MarketStreet. They discuss the current state of web development with LLMs, managing technical debt in startups, the evolution of infrastructure and reliability engineering, the role of community in technology, and the future of software engineering with AI.…
In this episode, Sal Furino, Customer Reliability Engineer at Bloomberg, discusses all things Service Level Objectives (SLOs) with hosts Steve McGhee and Matt Siegler. Together, they dig into what successful SLOs look like, how they relate to users, and how SLOs provide an effective framework for joint decisions about system reliability across product, engineering, and leadership teams.…

Matt Zelesko, the head of Site Reliability Engineering at Google, discusses the evolution of SRE, highlighting the shift from traditional operations to a model that balances velocity and reliability to better serve the rapid advancements in AI and ML. He emphasizes that SRE's core mission is to enable partners to move quickly while meeting reliability goals, and that the sheer scale of Google's infrastructure necessitates the SRE model for cross-system problem-solving. Zelesko envisions AI as a crucial assistant for SREs, improving incident detection, mitigation, and postmortem processes, and allowing SREs to focus on more complex engineering challenges and risk management earlier in the development cycle, while still valuing the hands-on experience of operating production infrastructure.…
In this Google Prodcast episode, Todd Underwood, a reliability expert from Anthropic with experience at Google and OpenAI, joins the hosts to discuss the current state and future of AI and ML in production, particularly for SREs. Topics discussed include the challenges of AI-Ops, limitations of current anomaly detection, the potential for AI in config authoring and troubleshooting, trade-offs between product velocity and reliability, the evolving role of SREs in an AI-driven world, and the optimal timing of book publication.…

This episode features guest Peter Pellerzi (Distinguished Engineer, Google). Peter and the hosts, Matt Siegler and Steve McGhee, focus on the physical infrastructure side of SRE, discussing topics such as the scale of Google's data centers, handling incidents like power outages, testing and preparedness strategies, the use of AI for optimizing cooling plants, and more. Peter also emphasizes the importance of community support, proactive planning, and learning from real-world testing and incidents to ensure high availability and resilience in data center operations.…

Jessica Theodat (Senior SRE & Security Tech Lead, Google) joins hosts Jordan Greenberg and Steve McGhee to discuss the intersection of security and site reliability engineering at Google. Jessica touches on risk management, the unique nature of security incident responses, and the shared goals between security and SRE. The crew also delves into the balance between security and SRE, acknowledging the tension and the need for collaboration between teams to achieve business goals and user trust.…
In this "bumpisode", hosts and producers of Prodcast (including our new co-host, Matt Siegler!) reflect on the previous season and introduce the new season's focus on upcoming trends in Site Reliability Engineering (SRE) and AI, and the friends we make along the way. They also introduce new elements we are bringing in with Season 4, such as a video format and a feedback form.…

This episode features Javi Beltran, a Google engineering lead who created the "Telebot" theme song. With our beloved hosts, Steve McGhee and Jordan Greenberg, Beltran discusses the origins of the song, created in 2012 for Google's paging system. The song was meant to add a touch of levity to what could be a stressful situation for engineers on-call. Beltran also unveils a new, more modern remix of “Telebot” (created in collaboration with our host, Jordan Greenberg!) which will be used as the intro theme for the podcast's next season.…

Imperative vs. Declarative Change Workflows with Dominic Hutton & Niccolo' Cascarano (36:10)
In this episode of the Prodcast, guests Dominic Hutton (Staff SRE, HashiCorp) and Niccolo' Cascarano (Senior Staff SRE at Google) join hosts Steve McGhee and Jordan Greenberg to dive into configurations. They discuss the differences between imperative and declarative configuration, explore the benefits and challenges of each approach, and stress the need for careful consideration when choosing between the two. Ultimately, the goal is to achieve reliable and maintainable systems through effective configuration management.…

Human Factors in Complex Systems with Casey Rosenthal and John Allspaw (41:18)
This episode features Casey Rosenthal (Founder, Cirrusly.ai) and John Allspaw (Founder and Principal, Adaptive Capacity Labs), joining our hosts Steve McGhee and Jordan Greenberg. Together they discuss how resilience appears in Software Engineering and SRE and explore the importance of understanding the human factors involved in adapting to system failures—highlighting the need for a more qualitative and holistic approach to understanding how engineers successfully adapt to system behavior and improving overall reliability.…

Embracing Complexity with Christina Schulman & Dr. Laura Maguire (33:59)
In this episode of the Prodcast, we are joined by guests Christina Schulman (Staff SRE, Google) and Dr. Laura Maguire (Principal Engineer, Trace Cognitive Engineering). They emphasize the human element of SRE and the importance of fostering a culture of collaboration, learning, and resilience in managing complex systems. They touch upon topics such as the need for diverse perspectives and collaboration in incident response and the necessity of embracing complexity, and they explore concepts such as aerodynamic stability, among others.…

Maglev: load balancing at Google with Cody Smith and Trisha Weir (32:53)
In this episode, Cody Smith (CTO and Co-founder, Camus Energy) and Trisha Weir (SRE Department Lead, Google) join hosts Steve McGhee and Jordan Greenberg to discuss their experience developing Maglev, a highly available and distributed network load balancer (NLB) that is an integral part of the cloud architecture and manages the traffic that comes into a datacenter. Starting with Maglev’s humble beginnings as a skunkworks effort, Cody and Trisha recount the challenges they faced, and emphasize the importance of psychological safety, collaboration, and adaptability in SRE innovation.…

In this episode, guests Narayan Desai (Principal SRE, Google) and Pat Somaru (Senior Production Engineer, Meta) join hosts Steve McGhee and Florian Rathgeber to discuss the challenges of observability and working with profiling data. The discussion covers intriguing topics like noise reduction, workload modeling, and the need for better tools and techniques to handle high-cardinality data.…

Google Public DNS (8.8.8.8) with Wilmer van der Gaast and Andy Sykes (32:07)
This episode features Google engineers Wilmer van der Gaast (Production on-tall) and Andy Sykes (Senior Staff Systems Engineer, SRE), joining hosts Steve McGhee and Jordan Greenberg, to discuss the development and maintenance of Google Public DNS (8.8.8.8). They highlight the initial motivations for creating the service, technical challenges like cache poisoning and load balancing, as well as the collaborative effort between SRE and SWE teams to address these issues. They also reflect on the evolving nature of SRE and advice for aspiring SREs.…

SRE in the Retail and Gaming Worlds with Jordan Chernev & Scott Bowers (33:40)
Guests Jordan Chernev (Senior Technology Executive) and Scott Bowers (SRE, Gearbox Software), who hail from the retail and gaming industries, respectively, join hosts Steve McGhee and Jordan Greenberg to discuss the unique challenges of Site Reliability Engineering in their industries. They share the importance of aligning SLOs with user experience, strategies for handling spikes in traffic, communicating with users during outages, and investing in reliability.…

Sarah Butt (Principal Engineer, Centralized Incident Response, Salesforce) and Vrai Stacey (Staff Software Engineer, Google) join hosts Steve McGhee and Jordan Greenberg to dive into incident response—particularly tooling and software for reliability incidents. Tune in for an in-depth discussion on topics such as the importance of communication and collaboration during incidents, and the role of tooling in supporting incident response processes. Sarah and Vrai also share personal takeaways from incidents they have experienced.…

Building Reliable Systems with Silvia Botros and Niall Murphy (42:06)
Silvia Botros (SRE Architect, Twilio | Author of "High Performance MySQL, 4th edition") and Niall Murphy (Co-founder & CEO, Stanza) join hosts Steve McGhee and Jordan Greenberg to discuss cultural shifts in database engineering, rate limiting, load shedding, holistic approaches to reliability, proactive measures to build customer trust, and much more!…

Liz Fong-Jones (former Google SRE and current Field CTO at honeycomb.io) joins hosts Steve McGhee and Jordan Greenberg for a lively discussion centered around observability, its evolution from monitoring, and its role in modern software development. Tune in for more on the importance of observability as a spectrum, the evolving role of SREs, and advice to aspiring software engineers.…

Ben Treynor Sloss (VP of Engineering, Google) joins hosts Steve McGhee and Dr. Jennifer Petoff (Director of Technical Infrastructure Education, Google) to share the evolution of SRE and its impact on software development, how AI and ML significantly impact SRE practices, and the future of SRE. Ben coined the term "Site Reliability Engineering" for his team of (now) 4,000 software engineers, engaged in what were traditionally operations functions. Under Ben's leadership, Google SRE wrote two best-selling books on SRE. Since then, the rest of the SaaS industry has come to adopt the SRE name, mission, and practices.…

There Remains a Huge Amount of Work to Do, with Healfdene Goguen (26:14)
In this episode, Healfdene Goguen (Principal Engineer, Google) joins hosts Steve McGhee and Jordan Greenberg to discuss the vast amount of work to be done by SREs, and the fascinating challenges to tackle with clear real-world implications. It's a truly exciting time to be an SRE at Google!

SRE, a Basis of Influence, with Amy Tobey & Vladyslav Ukis (41:02)
In this season of Google Prodcast, current and former SREs, both within and outside of Google, chat with hosts Steve McGhee and Jordan Greenberg to discuss software systems designed and built by SREs. For "episode zero", guests Amy Tobey (Live Services SRE, Netflix) and Dr. Vladyslav Ukis (Head of R&D, Siemens Healthineers, Author of "Establishing SRE Foundations") will set the stage for the season with a lively discussion about what Software Engineering means to Site Reliability Engineering.…

Life of An SRE: Life after Google SRE, with Carla Geisser, Cody Smith, and Laura Nolan (46:32)
Former Google SREs, or "Xooglers", talk with hosts MP and Steve McGhee about site reliability engineering outside of Google. What’s the difference in scale? What skills are generally valuable? And why can’t you build “SRE in a box” that jump-starts pretty much any organization? Join Carla Geisser, Cody Smith, and Laura Nolan in their lively conversation about what SRE skills and knowledge they have found useful in roles outside of Google.…
Sabrina Farmer, VP of Engineering at Google, talks about her career journey through Site Reliability Engineering. What does management mean? What’s involved in being an effective manager? And what’s a feasibility study? Hear some great advice on how to get what you expect out of a role, wherever on the ladder it is.…
Dave Reisner talks about his path to Staff SRE, from ArchLinux contributor through DevOps to software engineer. This episode emphasizes the value of strong mentoring and manager relationships, and the challenges of work-life balance.

Explore the role and responsibilities of an SRE manager with Stephen Benjamin.
Explore the role and responsibilities of a Senior SRE with Jessica Theodat, as she discusses life-work balance, the value of mentoring, and being a Black woman in SRE.

Explore the career paths of SREs Shannon Brady and Theo Klein as they discuss their routes to Site Reliability Engineering and finding their areas of expertise.

In this episode, Mariuxi and Julian discuss their paths to SRE: what drew them initially to SRE, and what motivates them to continue developing skills.

How does one become an SRE? And what’s the career like? In this episode, Tom and Megan discuss their path to SRE.

Host MP English and former Google SRE John Reese (JTR) chat about the creation of the Prodcast. Visit https://sre.google/prodcast for transcripts and links to further reading.
Ayelet Sachto offers advice on creating an actionable, transparent, and blameless postmortem culture. Visit https://sre.google/prodcast for transcripts and links to further reading.

Adrienne Walcer discusses how to approach and organize incident management efforts throughout the production lifecycle. Visit https://sre.google/prodcast for transcripts and links to further reading.

Andrew Widdowson (APW) shares strategies for successful on-call rotations. Visit https://sre.google/prodcast for transcripts and links to further reading.

Pierre Palatin dives into different automation strategies, how to build confidence in your system, and why designing the UI may be your biggest challenge. Visit https://sre.google/prodcast for transcripts and links to further reading.

Pavan Adharapurapu details how to approach large-scale migrations while optimizing for user experience. Visit https://sre.google/prodcast for transcripts and links to further reading.
Narayan Desai explains why SLOs can be problematic and proposes alternative methods for monitoring complex, large-scale systems. Visit https://sre.google/prodcast for transcripts and links to further reading.
Amelia Harrison advises on when and how to alert, ideal coverage, and tuning. Visit https://sre.google/prodcast for transcripts and links to further reading.

Silvia Esparrachiari talks about the challenges of monitoring and the importance of understanding your users. Visit https://sre.google/prodcast for transcripts and links to further reading.

What is SRE, anyway? Jennifer Mace (Macey) gives us her definition of "site reliability engineer," discusses how to manage risk, and shares key questions to ask developers. Visit https://sre.google/prodcast for transcripts and links to further reading.