data-engineering topic
pyjanitor
Clean APIs for data cleaning. A Python implementation of the R package Janitor.
just-dashboard
Dashboards using YAML or JSON files
conduit
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
around-dataengineering
A Data Engineering & Machine Learning Knowledge Hub
cuelake
Use SQL to build ELT pipelines on a data lakehouse.
feast
The Open Source Feature Store for Machine Learning
versatile-data-kit
One framework to develop, deploy and operate data workflows with Python and SQL.
metarank
A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly Learn-to-Rank engine.
dataform
Dataform is a framework for managing SQL-based data operations in BigQuery
data-engineering-zoomcamp
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026.