startdataengineering.com

Start Data Engineering

Latest posts

Last updated 5 days ago

Data Engineering Interview Series #3: SQL

11 days ago

1. Introduction Every data engineering interview includes a SQL round. If you...

How to Extract Data from APIs for Data Pipelines using Python

about 1 month ago

1. Introduction 2. APIs are a way to communicate between systems on...

How to create an SCD2 Table using MERGE INTO with Spark & Iceberg

about 1 month ago

1. Introduction 1.1. Code and setup 2. MERGE INTO is used to...

How to quickly deliver data to business users? #1. Adv Data types & Schema evolution

about 2 months ago

1. Introduction 1.1. Pre-requisites 2. Use Schema evolution & advanced data types...

How to Manage Upstream Schema Changes in Data Driven Fast Moving Company

3 months ago

1. Introduction 2. Strategies for data teams to handle changing schemas 2.1. Meetings...

20250220

3 months ago

Jupyter notebook magics & SQL magic with DuckDB; uv Python package manager

Visual Studio Code (VSCode) extensions for data engineers

3 months ago

1. Introduction 2. Python environment setup 3. VSCode Primer 4. Extensions overview...

Should Python Data Pipelines be Function based or Object-Oriented (OOP)?

3 months ago

1. Introduction 2. Data transformations as functions lead to maintainable code 3...

How to turn a 1000-line messy SQL into a modular, & easy-to-maintain data pipeline?

3 months ago

1. Introduction 2. Split your SQL into smaller parts 2.1. Start with...

How to ensure consistent metrics in your warehouse

4 months ago

1. Introduction 2. Centralize Metric Definitions in Code Option A: Semantic Layer...

Data Engineering Interview Series #2: System Design

4 months ago

1. Introduction 2. Guide the interviewer through the process 2.1. [Requirements gathering]...

How to reference a seed from a different dbt project?

5 months ago

1. Introduction 2. Ways to reuse seed data across multiple dbt projects...