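Content provided by The Data Flowcast. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The Data Flowcast or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ko.player.fm/legal.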
From Sensors to Datasets: Enhancing Airflow at Astronomer with Maggie Stark and Marion Azoulai
Failure rates down 13 percentage points, from 16% to 3%: that is what two data scientists at Astronomer achieved by re-architecting their data pipelines on Apache Airflow. In this episode, we enter the world of data orchestration and AI with Maggie Stark and Marion Azoulai, both Senior Data Scientists at Astronomer. Maggie and Marion discuss how their team re-architected their use of Airflow to improve scalability, reliability and efficiency in data processing. They share insights on overcoming challenges with sensors and explain how moving to datasets transformed their workflows.

Key Takeaways:
(02:23) The data team's role as a centralized hub within Astronomer.
(05:11) Airflow is the backbone of all data processes, running 60,000 tasks daily.
(07:13) Custom task groups enable efficient code reuse and adherence to best practices.
(11:33) Sensor-heavy architectures can lead to cascading failures and resource issues.
(12:09) Switching to datasets has improved reliability and scalability.
(14:19) Building a control DAG provides end-to-end visibility of pipelines.
(16:42) Breaking down DAGs into smaller units minimizes failures and improves management.
(19:02) Failure rates improved from 16% to 3% with the new architecture.

Resources Mentioned:
Maggie Stark - https://www.linkedin.com/in/margaretstark/
Marion Azoulai - https://www.linkedin.com/in/marionazoulai/
Astronomer | LinkedIn - https://www.linkedin.com/company/astronomer/
Apache Airflow - https://airflow.apache.org/
Astronomer | Website - https://www.astronomer.io/

Thanks for listening to The Data Flowcast: Mastering Airflow for Data Engineering & AI. If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

#AI #Automation #Airflow #MachineLearning
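A note on the numbers: the improvement from 16% to 3% quoted in the takeaways can be read either as an absolute drop in percentage points or as a relative reduction. The short Python sketch below works out both. The function name `failure_rate_change` is hypothetical and only serves this illustration; the two input figures are the ones from the episode description.

```python
def failure_rate_change(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    # Absolute change: simple difference between the two rates.
    absolute_drop = before_pct - after_pct
    # Relative change: the difference expressed as a share of the starting rate.
    relative_drop = absolute_drop / before_pct * 100
    return absolute_drop, relative_drop


# Figures from the episode description: 16% before, 3% after.
points, relative = failure_rate_change(16.0, 3.0)
print(f"{points:.0f} percentage points, {relative:.2f}% relative reduction")
# prints "13 percentage points, 81.25% relative reduction"
```

Read this way, the 13 in the headline is a difference in percentage points, while the relative reduction in failures is considerably larger.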
37 episodes
The Data Flowcast: Mastering Airflow for Data Engineering & AI