DataMass Gdańsk Summit 2023
phone: +48 570 611 811

Build reliable data pipelines using Modern Data Stack in the cloud

In this one-day workshop, you will learn how to create modern data transformation pipelines managed by dbt and orchestrated with Apache Airflow. You will discover how you can improve your pipelines’ quality and the workflow of your data team by introducing a set of tools aimed at standardizing the way good practices are incorporated within the data team: version control, testing, monitoring, change-data-capture, and easy scheduling. We will work through typical data transformation problems you can encounter on the journey to deliver fresh & reliable data, and see how modern tooling can help solve them. All hands-on exercises will be carried out in a public cloud environment (e.g. GCP or AWS).

During the workshop, participants will follow a shared step-by-step guide that shows how to augment a data team’s workflow with dbt. A Jupyter Notebook environment will be supplied for each participant, along with pre-generated datasets so that everyone can work through the example real-life use case scenario.

Target Audience
Data analysts, analytics engineers & data engineers who are interested in learning how to build and deploy data transformation workflows faster than ever before. Anyone who would like to leverage their SQL skills and start building data pipelines more easily.

What do you get after the training?

  • Concise and practical knowledge of applying dbt to solve typical data pipeline problems in a modern way: managing run order, handling data quality issues, monitoring, and scheduling transformations with Apache Airflow
  • Hands-on coding experience under the supervision of Data Engineers experienced in maintaining dbt pipelines
  • Tips about real-world applications and best practices.


Prerequisites

  • SQL fluency: ability to write data transforming queries
  • Basic understanding of ETL processes
  • Basic experience with a command-line interface
  • Laptop with a stable internet connection (participants will connect to Jupyter Notebooks pre-created in a cloud environment)

Session #1 - Introduction to Modern Data Stack

  • What is the Modern Data Stack? Intro
  • Key components of MDS
  • Core concepts of dbt
    • Data models
    • Seeds, sources
    • Tests
    • Documentation
  • Hands-on exercises
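To make the core concepts concrete, here is a minimal sketch of a dbt data model with a source and schema tests. The source table `raw.orders`, the model name, and the column names are illustrative assumptions, not the workshop's actual dataset:

```sql
-- models/staging/stg_orders.sql
-- Hypothetical staging model: reads from an assumed source table raw.orders
with source as (
    select * from {{ source('raw', 'orders') }}
)
select
    id as order_id,
    customer_id,
    cast(ordered_at as date) as order_date,
    amount
from source
```

```yaml
# models/staging/schema.yml
# Declares the source and attaches generic tests to the model's key column
version: 2
sources:
  - name: raw
    tables:
      - name: orders
models:
  - name: stg_orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
```

Running `dbt run` builds the model in the warehouse, and `dbt test` checks the declared `unique` and `not_null` constraints against the materialized data.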

Session #2 - Simple end-to-end data pipeline

  • Data discovery (data search, usage statistics, data lineage)
  • Data profiling & exploration
  • Transforming data using SQL with dbt
  • Data consumption with BI tools
  • Hands-on exercises

Session #3 - Data pipeline - scheduling, deployment & advanced features

  • Apache Airflow as a workflow scheduler
  • Data testing & data observability
  • Exploring transformed data with Data Studio
  • Hands-on exercises
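The scheduling pattern covered in this session can be sketched as an Airflow DAG that triggers dbt on a daily cadence. The project path, DAG id, and schedule below are illustrative assumptions, not the workshop's actual configuration:

```python
# dags/dbt_daily.py
# Minimal sketch (Airflow 2.x): run dbt models daily, then run dbt tests.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="dbt_daily",
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/project",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/project",
    )
    # Build models first; only test once the run succeeds
    dbt_run >> dbt_test
```

Chaining `dbt run >> dbt test` means a failed test halts the pipeline before stale or broken data reaches downstream consumers.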


  • 9.00 - 10.30
  • 10.30 - 11.00 - break
  • 11.00 - 13.00
  • 13.00 - 13.45 - lunch
  • 13.45 - 15.45
  • 15.45 - 16.00 - break
  • 16.00 - 17.00

Maximum number of attendees

Time Box
9.00 - 17.00 | 8h

Session leader:

Data Analyst / Analytics Engineer
Data Engineer