
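Content provided by Tobias Macey. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Tobias Macey or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ko.player.fm/legal.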

Reduce Friction In Your Business Analytics Through Entity Centric Data Modeling

1:12:55
 

Summary

For business analytics, the way that you model the data in your warehouse has a lasting impact on what types of questions can be answered quickly and easily. The major strategies in use today were created decades ago, when the software and hardware for warehouse databases were far more constrained. In this episode, Maxime Beauchemin, of Airflow and Superset fame, shares his vision for the entity-centric data model and how you can incorporate it into your own warehouse design.

Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack
  • Your host is Tobias Macey and today I'm interviewing Max Beauchemin about the concept of entity-centric data modeling for analytical use cases

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what entity-centric modeling (ECM) is and the story behind it?
    • How does it compare to dimensional modeling strategies?
    • What are some of the other competing methods?
    • Comparison to activity schema
  • What impact does this have on ML teams? (e.g. feature engineering)
  • What role does the tooling of a team have in the ways that they end up thinking about modeling? (e.g. dbt vs. Informatica vs. ETL scripts, etc.)
    • What is the impact of the underlying compute engine on the modeling strategies used?
  • What are some examples of data sources or problem domains for which this approach is well suited?
    • What are some cases where entity-centric modeling techniques might be counterproductive?
  • What are the ways that the benefits of ECM manifest in use cases that are downstream from the warehouse?
  • What are some concrete tactical steps that teams should be thinking about to implement a workable domain model using entity-centric principles?
    • How does this work across business domains within a given organization (especially at "enterprise" scale)?
  • What are the most interesting, innovative, or unexpected ways that you have seen ECM used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on ECM?
  • When is ECM the wrong choice?
  • What are your predictions for the future direction/adoption of ECM or other modeling techniques?

Contact Info

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show, then tell us about it! Email [email protected] with your story.
  • To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

Links

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

