Course Outline

Week 1 — Introduction to Data Engineering

  • Data engineering fundamentals and modern data stacks
  • Data ingestion patterns and sources
  • Batch vs streaming concepts and use cases
  • Hands-on lab: ingesting sample data into cloud storage

Week 2 — Databricks Lakehouse Foundation Badge

  • Databricks platform fundamentals and workspace navigation
  • Delta Lake concepts: ACID, time travel, and schema evolution
  • Workspace security, access controls, and Unity Catalog basics
  • Hands-on lab: Delta table creation and management

Week 3 — Advanced SQL on Databricks

  • Advanced SQL constructs and window functions at scale
  • Query optimization, explain plans, and cost-aware patterns
  • Materialized views, caching, and performance tuning
  • Hands-on lab: optimizing analytical queries on large datasets

Week 4 — Databricks Certified Developer for Apache Spark (Prep)

  • Spark architecture, RDDs, DataFrames, and Datasets deep dive
  • Key Spark transformations and actions; performance considerations
  • Spark Streaming basics and Structured Streaming patterns
  • Practice exam exercises and hands-on test problems

Week 5 — Introduction to Data Modeling

  • Concepts: dimensional modeling, star schema design, and normalization
  • Lakehouse modeling vs traditional warehouse approaches
  • Design patterns for analytics-ready datasets
  • Hands-on lab: building consumption-ready tables and views

Week 6 — Introduction to Import Tools & Data Ingestion Automation

  • Connectors and ingestion tools for Databricks (AWS Glue, Azure Data Factory, Kafka)
  • Stream ingestion patterns and micro-batch designs
  • Data validation, quality checks, and schema enforcement
  • Hands-on lab: building resilient ingestion pipelines

Week 7 — Introduction to Git Flow and CI/CD for Data Engineering

  • Git Flow branching strategies and repository organization
  • CI/CD pipelines for notebooks, jobs, and infrastructure as code
  • Testing, linting, and deployment automation for data code
  • Hands-on lab: implementing a Git-based workflow and automated job deployment

Week 8 — Databricks Certified Data Engineer Associate (Prep) & Data Engineering Patterns

  • Certification topics review and practical exercises
  • Architectural patterns: bronze/silver/gold, CDC, slowly changing dimensions
  • Operational patterns: monitoring, alerting, and lineage
  • Hands-on lab: end-to-end pipeline applying engineering patterns

Week 9 — Introduction to Airflow and Astronomer; Scripting

  • Airflow concepts: DAGs, tasks, operators, and scheduling
  • Astronomer platform overview and orchestration best practices
  • Scripting for automation: Python scripting patterns for data tasks
  • Hands-on lab: orchestrating Databricks jobs with Airflow DAGs

Week 10 — Data Visualization, Tableau, and Customized Final Project

  • Connecting Tableau to Databricks and best practices for BI layers
  • Dashboard design principles and performance-aware visualizations
  • Capstone: customized final project scoping, implementation, and presentation
  • Final presentations, peer review, and instructor feedback

Summary and Next Steps

Requirements

  • An understanding of basic SQL and data concepts
  • Experience with programming in Python or Scala
  • Familiarity with cloud services and virtual environments

Audience

  • Aspiring and practicing data engineers
  • ETL/BI developers and analytics engineers
  • Data platform and DevOps teams supporting pipelines
Duration: 350 hours
