data-engineering topic
datahelix
The DataHelix generator lets you quickly create data for testing and validation from a JSON profile that defines fields and the relationships between them
soda-core
Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Data-Engineering-Projects
Personal Data Engineering Projects
geni
A Clojure dataframe library that runs on Spark
etl
PHP ETL (Extract, Transform, Load) data processing library
awesome-dbt
A curated list of awesome dbt resources
streamify
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
dbt-sugar
dbt-sugar is a CLI tool that makes performing actions around dbt models easier and more fun for dbt users
auptimizer
An automatic ML model optimization tool.
butterfree
A tool for building feature stores.