data-engineering-project

End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker.

Results: 7 data-engineering-project issues

Hello @HamzaG737, thank you for putting together this project to help people get started on data engineering projects. I was following along with [this document](https://towardsdatascience.com/end-to-end-data-engineering-system-on-real-data-with-kafka-spark-airflow-postgres-and-docker-a70e18df4090) and when trying to set up...

![image](https://github.com/user-attachments/assets/7863e298-2f73-4f24-bb67-f2e187877a78) encountered a "ValueError" due to improper date-time formatting while running the "kafka_data_stream" job in the "kafka_spark_dag" DAG. The error occurs in the "get_all_data" function when attempting to parse a...
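The issue text is cut off before it shows the offending string, so the exact format is unknown. As a rough sketch of one way to make date-time parsing more tolerant, the helper below tries several formats in turn. The function name `parse_datetime_lenient` and the list of candidate formats are assumptions made for illustration; they are not taken from the project's `get_all_data` function.

```python
from datetime import datetime

# Hypothetical candidate formats; the format actually returned by the
# project's data source is not shown in the issue.
CANDIDATE_FORMATS = (
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
)


def parse_datetime_lenient(value: str) -> datetime:
    """Try each candidate format in order and return the first match."""
    cleaned = value.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            # This format did not match; try the next one.
            continue
    # None of the formats matched: surface the raw input for debugging.
    raise ValueError(f"Unrecognized date-time string: {value!r}")
```

Including the raw input in the final error message usually makes it easier to see which value from the upstream data broke the job.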

Hi Hamza, thanks a lot for this project. I have a problem with the Airflow DAGs: I don't see dag_kafka_spark in the DAG list on the Airflow home page (port 8080). Any ideas...

## What happened?
When running `pip install -r requirements.txt` the following error appears: `ERROR: No matching distribution found for apache-airflow==2.7.3`

## My environment
- Python version: 3.12.4
- Pip: 24.0
- Anaconda: 24.9.1...
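A likely cause, stated here as an assumption rather than a confirmed diagnosis, is that `apache-airflow==2.7.3` does not publish a distribution for Python 3.12, so pip finds no match for that interpreter. The sketch below checks an interpreter version against an assumed supported range of 3.8 to 3.11; the function name `python_supported` and both bounds are illustrative.

```python
import sys

# Assumption: Airflow 2.7.x supports Python 3.8 through 3.11 only.
SUPPORTED_MIN = (3, 8)
SUPPORTED_MAX = (3, 11)


def python_supported(version=None, minimum=SUPPORTED_MIN, maximum=SUPPORTED_MAX) -> bool:
    """Return True if the (major, minor) version lies within the range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return minimum <= major_minor <= maximum


if __name__ == "__main__":
    if not python_supported():
        print(
            f"Python {sys.version_info.major}.{sys.version_info.minor} is outside "
            f"the assumed supported range {SUPPORTED_MIN} to {SUPPORTED_MAX}."
        )
```

If the check fails, creating a fresh environment on an older interpreter (for example Python 3.11) before running `pip install -r requirements.txt` is one option to try.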

This PR fixes issues related to Kafka and Kafka-UI connections in the `docker-compose.yml` file.
- **Fixed:** Incorrect `KAFKA_CFG_ADVERTISED_LISTENERS` causing connection failures.
- **Fixed:** Kafka-UI `KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS` misconfiguration.
- **Updated:** Replaced deprecated...
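Advertised listeners are a comma-separated list of `NAME://host:port` entries, and clients connect to whichever address the broker advertises. Containers on the same Docker network generally need the service name, while clients on the host need an address reachable from the host. As a small sketch for inspecting such a value, the helper below splits it into a mapping; the function name `parse_listeners` and the example listener names, hosts, and ports are hypothetical and not taken from the PR.

```python
def parse_listeners(value: str) -> dict[str, tuple[str, int]]:
    """Split 'NAME://host:port,...' into {NAME: (host, port)}."""
    listeners = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # Ignore empty segments such as a trailing comma.
        name, sep, address = entry.partition("://")
        host, colon, port = address.rpartition(":")
        if not sep or not colon:
            raise ValueError(f"Malformed listener entry: {entry!r}")
        listeners[name] = (host, int(port))
    return listeners


# Hypothetical value: one listener for containers, one for the host machine.
example = "PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094"
print(parse_listeners(example))
```

Printing the parsed mapping makes it easier to compare each advertised address against the bootstrap servers that Kafka-UI and other clients are configured to use.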

I suggest this change because the Bitnami Kafka image is now behind a paywall and no longer publicly available, which prevents CI/CD and local environments from functioning properly. The Soldevelo...