data-engineering topic

List data-engineering repositories

Made-With-ML

37.0k Stars, 5.9k Forks

Learn how to design, develop, deploy and iterate on production-grade ML applications.

mlrun

1.3k Stars, 239 Forks

MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates t...

blaze

914 Stars, 87 Forks

Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.

spark-alchemy

179 Stars, 33 Forks

Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive

d6t-python

117 Stars, 18 Forks

Accelerate data science

datadocs

90 Stars, 11 Forks

Documentation for data enthusiasts

soorgeon

75 Stars, 19 Forks

Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊

dataplane

188 Stars, 30 Forks

Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a...

beneath

81 Stars, 9 Forks

Beneath is a serverless real-time data platform ⚡️

Movalytics-Data-Warehouse

117 Stars, 27 Forks

Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow