
Real-Time Data Transformation and Analytics with dbt Labs

43:41
 
Content provided by Confluent, founded by the original creators of Apache Kafka®. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Confluent or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ko.player.fm/legal.

dbt is known for being part of the Modern Data Stack (MDS) for ELT processes. As part of the MDS, dbt Labs believes in using best-of-breed tools for every part of the stack. Often, teams use an EL tool like Fivetran to pull data from the database into the warehouse, then use dbt to manage the transformations within the warehouse. Analysts can then build dashboards on top of that data or run tests.
An analyst could adapt this process to a microservice application built on Apache Kafka®, using the same method to pull batch data out of each and every database. However, in this episode, Amy Chen (Partner Engineering Manager, dbt Labs) tells Kris about a better way forward for analysts willing to adopt a streaming mindset: reusable pipelines built from dbt models that pull events into the warehouse as they arrive and materialize as materialized views by default.
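To make that concrete, here is a minimal sketch of what such a dbt model might look like. The source, model, and column names (kafka_events, orders, order_total, and so on) are hypothetical, and the materialized_view materialization assumes a recent dbt version and a warehouse adapter that supports it; the episode itself does not prescribe this exact code.

    -- models/streaming/orders_enriched.sql
    -- Hypothetical example: source, model, and column names are illustrative only.
    -- Configures the model to materialize as a materialized view, assuming a
    -- dbt version and warehouse adapter that support that materialization.
    {{ config(materialized='materialized_view') }}

    select
        order_id,
        customer_id,
        order_total,
        ordered_at
    from {{ source('kafka_events', 'orders') }}  -- events landed in the warehouse
    where order_total is not null

Depending on the warehouse, the resulting view can then stay up to date as new events land, rather than waiting on a scheduled batch run.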

dbt Labs is the company that makes and maintains dbt. dbt Core is the open-source data transformation framework that allows data teams to operate with software engineering’s best practices. dbt Cloud is the fastest and most reliable way to deploy dbt.
Inside the world of event streaming, there is a push to expand data access beyond the programmers writing the code and toward everyone involved in the business. At dbt Labs, they're attempting something of the reverse: getting data analysts to adopt the best practices of software engineers and, more recently, of streaming programmers. They're improving the process of building data pipelines while empowering businesses to bring more contributors into the analytics process, with an easy-to-deploy, easy-to-maintain platform. It offers version control to analysts who traditionally don't have access to Git, along with the ability to easily automate testing, all in the same place.
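As one illustration of what automated testing can look like for analysts, below is a minimal sketch of a dbt singular test: a SQL file in the project's tests/ directory whose query selects rows that violate an assumption, and dbt marks the test as failed if any rows come back. The model and column names are hypothetical, not taken from the episode.

    -- tests/assert_no_negative_order_totals.sql
    -- Hypothetical singular test: `dbt test` runs this query and fails the
    -- test if it returns any rows.
    select
        order_id,
        order_total
    from {{ ref('orders_enriched') }}  -- illustrative model name
    where order_total < 0

Because a test like this lives in the same version-controlled project as the models it checks, analysts get review and automated checks in one place.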
In this episode, Kris and Amy explore:

  • How to revolutionize testing for analysts with two of dbt’s core functionalities
  • What streaming in a batch-based analytics world should look like
  • What can be done to improve workflows
  • How to democratize access to data for everyone in the business

EPISODE LINKS


Chapters

1. Intro (00:00:00)

2. What is MDS? (00:03:48)

3. What is dbt? (00:08:48)

4. Who uses dbt? (00:10:32)

5. How does someone get started with dbt? (00:14:30)

6. How does dbt fit into the world of streaming? (00:20:44)

7. How can you do unit testing with dbt? (00:24:04)

8. Will batch and streaming always be a part of the solution? (00:26:12)

9. What are event streamers doing wrong? (00:32:54)

10. What are some things to know about data testing with dbt? (00:37:19)

11. What should people be watching for in the industry? (00:40:41)

12. It's a wrap! (00:41:52)
