This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Welcome to The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI, the podcast where we keep you up to date with insights and ideas propelling the Airflow community forward. Join us each week as we explore the current state, future, and potential of Airflow with leading thinkers in the community, and discover how best to leverage this workflow management system to meet the ever-evolving needs of data engineering and AI ecosystems. Podcast Webpage: https://www.astronomer.io ...
The Data Engineering Show is a podcast for data engineering and BI practitioners to go beyond theory. Learn from the biggest influencers in tech about their practical day-to-day data challenges and solutions in a casual and fun setting. SEASON 1 DATA BROS Eldad and Boaz Farkash shared the same stuffed toys growing up as well as a big passion for data. After founding Sisense and building it to become a high-growth analytics unicorn, they moved on to their next venture, Firebolt, a leading hig ...
Discussions around Data Engineering
Databases and data engineering episodes of Software Engineering Daily
"Unlocking the Power of Data: A Guide for Leaders and Executives" As a leader or executive, you know the importance of data in driving business decisions and staying ahead of the competition. But with the increasing amount of data generated daily, it can be overwhelming to know where to start and how to use this valuable asset effectively. This multi-topic blog covers the technical terminology of data engineering and analytics on the cloud.

Building a Data-Driven Beauty and Wellness Marketplace at StyleSeat with Paschal Onuorah
23:05
StyleSeat is revolutionizing how beauty and wellness professionals grow their businesses through data-driven tools. From streamlining scheduling to optimizing marketing, their platform empowers professionals to focus on their craft while expanding their client base. In this episode, Paschal Onuorah, Senior Data Engineer at StyleSeat, shares how the…

Aligning Business and Data: The Essential Role of Data Modeling
1:06:51
Summary: In this episode of the Data Engineering Podcast Serge Gershkovich, head of product at SqlDBM, talks about the socio-technical aspects of data modeling. Serge shares his background in data modeling and highlights its importance as a collaborative process between business stakeholders and data teams. He debunks common misconceptions that dat…

Is Self-Service BI a False Promise? Lei Tang of Fabi.ai Thinks So
21:07
Explore the future of AI-powered business intelligence with Lei Tang, CTO and Co-founder of Fabi.ai, as he discusses the evolution from traditional self-service BI to "Vibe-analytics." Learn how AI is transforming data accessibility, enabling anyone to perform sophisticated analytics without deep technical expertise. From building trust in AI-gener…

Building the Future of Airflow Execution at Astronomer with Ian Buss and Piotr Chomiak
22:25
The evolution of orchestration in Airflow continues with innovations that address both scalability and security. From improving executor reliability to enabling remote execution, these advancements reshape how organizations manage data pipelines. In this episode, we’re joined by Ian Buss, Principal Software Engineer at Astronomer, and Piotr Chomiak…

From Academia to Industry: Bridging Data Engineering Challenges
50:54
Summary: In this episode of the Data Engineering Podcast Professor Paul Groth, from the University of Amsterdam, talks about his research on knowledge graphs and data engineering. Paul shares his background in AI and data management, discussing the evolution of data provenance and lineage, as well as the challenges of data integration. He explores t…

Scaling On-Prem Airflow With 2,000 DAGs at Numberly with Sébastien Crocquevieille
24:17
Scaling 2,000+ data pipelines isn’t easy. But with the right tools and a self-hosted mindset, it becomes achievable. In this episode, Sébastien Crocquevieille, Data Engineer at Numberly, unpacks how the team scaled their on-prem Airflow setup using open-source tooling and Kubernetes. We explore orchestration strategies, UI-driven stakeholder access…

High Performance And Low Overhead Graphs With KuzuDB
1:01:29
Summary: In this episode of the Data Engineering Podcast Prashanth Rao, an AI engineer at KuzuDB, talks about their embeddable graph database. Prashanth explains how KuzuDB addresses performance shortcomings in existing solutions through columnar storage and novel join algorithms. He discusses the usability and scalability of KuzuDB, emphasizing its…

How Moniepoint Group Uses Airflow for Exposure Monitoring with Adeolu Adegboye
21:32
Managing financial data at scale requires precise orchestration and proactive monitoring to maintain operational efficiency. In this episode, we are joined by Adeolu Adegboye, Data Engineer at Moniepoint Group, who shares how his team uses data pipelines and workflow automation to manage high volumes of transactions, ensure timely alerts and suppor…

Bridging Data and Decision-Making: AI's Role in Modern Analytics
1:10:44
Summary: In this episode of the Data Engineering Podcast Lucas Thelosen and Drew Gilson from Gravity talk about their development of Orion, an autonomous data analyst that bridges the gap between data availability and business decision-making. Lucas and Drew share their backgrounds in data analytics and how their experiences have shaped their approa…

Inside Bosch’s Airflow 3 Revolution: Remote Execution with Jens Scheffler
28:02
The evolution of Airflow has reached a milestone with the introduction of remote execution in Airflow 3, enabling flexible orchestration across distributed environments. In this episode, Jens Scheffler, Test Execution Cluster Technical Architect at Bosch, shares insights on how his team’s need for large-scale, cross-environment testing influenced t…
Summary: In this episode of the Data Engineering Podcast Andy Warfield talks about the innovative functionalities of S3 Tables and Vectors and their integration into modern data stacks. Andy shares his journey through the tech industry and his role at Amazon, where he collaborates to enhance storage capabilities, discussing the evolution of S3 from …

Inside Modern Data Infrastructure at Massdriver with Cory O’Daniel and Jake Ferriero
31:24
Managing modern data platforms means navigating a web of complex infrastructure, competing team needs and evolving security standards. For data teams to truly thrive, infrastructure must become both accessible and compliant without sacrificing velocity or reliability. In this episode, we’re joined by Cory O’Daniel, CEO and Co-Founder at Massdriver,…
Summary: In this episode of the Data Engineering Podcast Akshay Agrawal from Marimo discusses the innovative new Python notebook environment, which offers a reactive execution model, full Python integration, and built-in UI elements to enhance the interactive computing experience. He discusses the challenges of traditional Jupyter notebooks, such as…

Building Uber's AI Assistant: How Genie Revolutionizes On-Call Support with Paarth Chothani from Uber
25:31
Journey inside Uber's innovative AI assistant "Genie" with Paarth Chothani, Staff Engineer at Uber, as he shares how they're revolutionizing on-call support using LLMs and vector search. From processing massive amounts of internal documentation to building scalable RAG pipelines, discover how Uber tackles the challenges of implementing AI assistants…

Warehouse Native Incremental Data Processing With Dynamic Tables And Delayed View Semantics
55:07
Summary: In this episode of the Data Engineering Podcast Dan Sotolongo from Snowflake talks about the complexities of incremental data processing in warehouse environments. Dan discusses the challenges of handling continuously evolving datasets and the importance of incremental data processing for optimized resource use and reduced latency. He expla…
Telemetry has the potential to guide the future of Airflow, but only if it’s implemented transparently and with community trust. In this episode, we’re joined by Bolke de Bruin, Director at Metyis and a long-time Airflow PMC member. Bolke discusses how telemetry has been handled in the past, why it matters now and what it will take to get it right.…

Streamlining Data Pipelines with MCP Servers and Vector Engines
52:04
Summary: In this episode of the Data Engineering Podcast Kacper Łukawski from Qdrant talks about integrating MCP servers with vector databases to process unstructured data. Kacper shares his experience in data engineering, from building big data pipelines in the automotive industry to leveraging large language models (LLMs) for transforming unstructured d…

Transforming the Airflow UI for Cloudera’s Users with Shubham Raj
22:28
Contributing to open-source projects can be daunting, but it can also unlock unexpected innovation. This episode showcases how one engineer’s journey with Apache Airflow led to impactful UI enhancements and infrastructure solutions at scale. Shubham Raj, Software Engineer II at Cloudera, shares how his team built a drag-and-drop DAG editor for non-…

Streamlining Thousands of Data Pipelines at Lyft with Yunhao Qing
19:34
Managing data pipelines at scale is not just a technical challenge. It is also an organizational one. At Lyft, success means empowering dozens of teams to build with autonomy while enforcing governance and best practices across thousands of workflows. In this episode, we speak with Yunhao Qing, Software Engineer at Lyft, about building a governed d…
Summary: In this episode of the Data Engineering Podcast Effie Baram, a leader in foundational data engineering at Two Sigma, talks about the complexities and innovations in data engineering within the finance sector. She discusses the critical role of data at Two Sigma, balancing data quality with delivery speed, and the socio-technical challenges …

Enabling Agents In The Enterprise With A Platform Approach
54:18
Summary: In this episode of the Data Engineering Podcast Arun Joseph talks about developing and implementing agent platforms to empower businesses with agentic capabilities. From leading AI engineering at Deutsche Telekom to his current entrepreneurial venture focused on multi-agent systems, Arun shares insights on building agentic systems at an org…

Transforming Customer Education in Data Engineering at Astronomer with Marc Lamberti
22:19
Understanding the complexities of Apache Airflow can be daunting for newcomers and seasoned data engineers. But with the right guidance, mastering the tool becomes an achievable milestone. In this episode, Marc Lamberti, Head of Customer Education at Astronomer, joins us to share his journey from Udemy instructor to driving education at Astronomer,…

Embracing Data Mesh and SQL Sensors for Scalable Workflows at lastminute.com with Alberto Crespi
30:09
The flexibility of Airflow plays a pivotal role in enabling decentralized data architectures and empowering cross-functional teams. In this episode, we speak with Alberto Crespi, Data Architect at lastminute.com, who shares how his team scales Airflow across 12 teams while supporting both vertical and horizontal structures under a data mesh approac…

Dagster's New Era: Modularizing Data Transformation in the Age of AI
1:01:37
Summary: In this episode of the Data Engineering Podcast we welcome back Nick Schrock, CTO and founder of Dagster Labs, to discuss the evolving landscape of data engineering in the age of AI. As AI begins to impact data platforms and the role of data engineers, Nick shares his insights on how it will ultimately enhance productivity and expand softwa…

The AI-Ready Pipeline: Reimagining Airflow at Veyer® Logistics with Anu Pabla
23:21
Innovation in orchestration is redefining how engineers approach both traditional ETL pipelines and emerging AI workloads. Understanding how to harness Airflow’s flexibility and observability is essential for teams navigating today’s evolving data landscape. In this episode, Anu Pabla, Principal Engineer at The ODP Corporation, joins us to discuss …

AI and the Lakehouse: How Starburst is Pioneering New Workflows
44:09
Summary: In this episode of the Data Engineering Podcast Alex Albu, tech lead for AI initiatives at Starburst, talks about integrating AI workloads with the lakehouse architecture. From his software engineering roots to leading data engineering efforts, Alex shares insights on enhancing Starburst's platform to support AI applications, including an A…

From Zero to 100M Users: Inside Notion’s Data Stack and AI Strategy with Sumit Gupta
22:13
AI's transformative impact on data engineering and analytics is reshaping how professionals create value, shifting focus from technical skills to strategic thinking and communication. In this episode of The Data Engineering Show, the bros talk with Sumit Gupta, Lead BI Engineer at Notion, about his journey through prominent tech companies, modern d…

Streamlining AI and ML Operations at IBM with BJ Adesoji and Ryan Yackel
24:44
The orchestration layer is foundational to building robust AI- and ML-powered data pipelines, especially in complex hybrid enterprise environments. IBM’s partnership with Astronomer reflects a strategic alignment to simplify and scale Airflow-based workflows across industries. In this episode, we’re joined by IBM’s Senior Product Manager, BJ Adesoj…
Summary: In this episode of the Data Engineering Podcast Mai-Lan Tomsen Bukovec, Vice President of Technology at AWS, talks about the evolution of Amazon S3 and its profound impact on data architecture. From her work on compute systems to leading the development and operations of S3, Mai-Lan shares insights on how S3 has become a foundational element …

Inside the Custom Framework for Managing Airflow Code at Wix with Gil Reich
31:02
Efficient orchestration and maintainability are crucial for data engineering at scale. Gil Reich, Data Developer for Data Science at Wix, shares how his team reduced code duplication, standardized pipelines, and improved Airflow task orchestration using a Python-based framework built within the data science team. In this episode, Gil explains how t…
Summary: In this episode of the Data Engineering Podcast Chakravarthy Kotaru talks about scaling data operations through standardized platform offerings. From his roots as an Oracle developer to leading the data platform at a major online travel company, Chakravarthy shares insights on managing diverse database technologies and providing databases a…

Modernizing Legacy Data Systems With Airflow at Procter & Gamble with Adonis Castillo Cordero
22:13
Legacy architecture and AI workloads pose unique challenges at scale, especially in a global enterprise with complex data systems. In this episode, we explore strategies to proactively monitor and optimize pipelines while minimizing downstream failures. Adonis Castillo Cordero, Senior Automation Manager at Procter & Gamble, joins us to share action…

From Data Discovery to AI: The Evolution of Semantic Layers
49:30
Summary: In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Shinji Kim to discuss the evolving role of semantic layers in the era of AI. As they explore the challenges of managing vast data ecosystems and providing context to data users, they delve into the significance of semantic layers for AI applications. They dive i…

Building an End-to-End Data Observability System at Netflix with Joseph Machado
38:54
Building reliable data pipelines starts with maintaining strong data quality standards and creating efficient systems for auditing, publishing and monitoring. In this episode, we explore the real-world patterns and best practices for ensuring data pipelines stay accurate, scalable and trustworthy. Joseph Machado, Senior Data Engineer at Netflix, jo…