data-engineering topic

List data-engineering repositories

datahelix

140 stars, 50 forks

The DataHelix generator lets you quickly create data for testing and validation from a JSON profile that defines fields and the relationships between them.

soda-core

1.9k stars, 208 forks

⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

geni

278 stars, 28 forks

A Clojure dataframe library that runs on Spark

etl

340 stars, 21 forks

PHP ETL (Extract, Transform, Load) data processing library

awesome-dbt

943 stars, 95 forks

A curated list of awesome dbt resources

streamify

494 stars, 108 forks

A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!

dbt-sugar

149 stars, 20 forks

dbt-sugar is a CLI tool that makes common actions around dbt models easier and more enjoyable for dbt users.

butterfree

270 stars, 35 forks

A tool for building feature stores.