data-engineering topic

List data-engineering repositories

pyjanitor

1.3k
Stars
167
Forks

Clean APIs for data cleaning. A Python implementation of the R package janitor.

just-dashboard

1.6k
Stars
73
Forks

📊 📋 Dashboards using YAML or JSON files

conduit

353
Stars
41
Forks

Conduit streams data between data stores. Kafka Connect replacement. No JVM required.

around-dataengineering

1.1k
Stars
227
Forks

A Data Engineering & Machine Learning Knowledge Hub

cuelake

283
Stars
28
Forks

Use SQL to build ELT pipelines on a data lakehouse.

feast

5.3k
Stars
944
Forks

The Open Source Feature Store for Machine Learning

versatile-data-kit

413
Stars
54
Forks

One framework to develop, deploy and operate data workflows with Python and SQL.

metarank

2.0k
Stars
82
Forks

A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly learn-to-rank engine.

dataform

796
Stars
148
Forks

Dataform is a framework for managing SQL-based data operations in BigQuery.

data-engineering-zoomcamp

36.5k
Stars
7.4k
Forks
36.5k
Watchers

Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026.