data-engineering-project
End-to-end data engineering project with Kafka, Airflow, Spark, Postgres and Docker.
Hello @HamzaG737, thank you for putting together this project to help people get started on data engineering projects. I was following along with [this document](https://towardsdatascience.com/end-to-end-data-engineering-system-on-real-data-with-kafka-spark-airflow-postgres-and-docker-a70e18df4090) and when trying to set up...
Encountered a `ValueError` due to improper date-time formatting while running the `kafka_data_stream` job in the `kafka_spark_dag` DAG. The error occurs in the `get_all_data` function when attempting to parse a...
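For anyone hitting a similar `ValueError`, one defensive pattern is to try several timestamp formats before giving up. The sketch below is only an illustration: `parse_timestamp` and `CANDIDATE_FORMATS` are hypothetical names, and the formats listed are guesses, since the issue does not show the actual input or the body of `get_all_data`.

```python
from datetime import datetime
from typing import Optional

# Hypothetical candidate formats (assumption): the format the project
# actually receives is not shown in the issue.
CANDIDATE_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%f%z",  # ISO 8601 with fractional seconds and offset
    "%Y-%m-%dT%H:%M:%S%z",     # ISO 8601 with offset
    "%Y-%m-%d %H:%M:%S",       # naive date and time
    "%Y-%m-%d",                # date only
)


def parse_timestamp(raw: str) -> Optional[datetime]:
    """Return the first successful parse of `raw`, or None if no format matches."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            # This format did not match; try the next one.
            continue
    return None


if __name__ == "__main__":
    print(parse_timestamp("2024-01-15 10:30:00"))
    print(parse_timestamp("not a date"))
```

Returning `None` instead of raising lets the caller decide whether to skip the record or fail the task, which may or may not suit the real pipeline.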
Hi Hamza, thanks a lot for this project. I have a problem with the Airflow DAGs: I don't see `dag_kafka_spark` in the DAG list on the Airflow home page (port 8080). Any ideas...
## What happened?

When running `pip install -r requirements.txt`, the following error appears: `ERROR: No matching distribution found for apache-airflow==2.7.3`

## My environment

- Python version: 3.12.4
- Pip: 24.0
- Anaconda: 24.9.1...
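A likely cause of this pip error is the interpreter version rather than the package index: to my knowledge, the Airflow 2.7.x releases support Python 3.8 through 3.11, so pip on Python 3.12 finds no compatible distribution. The sketch below checks the running interpreter against that window before installing. `python_supported` and the two bounds are illustrative names and assumptions, not part of the project.

```python
import sys
from typing import Tuple

# Assumed support window for apache-airflow 2.7.x (assumption based on
# Airflow's published prerequisites): Python 3.8 up to, but excluding, 3.12.
MIN_SUPPORTED = (3, 8)
MAX_EXCLUSIVE = (3, 12)


def python_supported(version: Tuple[int, int]) -> bool:
    """Return True if (major, minor) falls inside the assumed support window."""
    return MIN_SUPPORTED <= version < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = sys.version_info[:2]
    if python_supported(current):
        print(f"Python {current[0]}.{current[1]} is inside the assumed window.")
    else:
        print(
            f"Python {current[0]}.{current[1]} is outside the assumed window; "
            "consider creating an environment with Python 3.11."
        )
```

If the check fails, creating a fresh virtual or conda environment pinned to a supported Python version is usually simpler than changing the pinned Airflow version.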
This PR fixes issues related to Kafka and Kafka-UI connections in the `docker-compose.yml` file.

- **Fixed:** Incorrect `KAFKA_CFG_ADVERTISED_LISTENERS` causing connection failures.
- **Fixed:** Kafka-UI `KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS` misconfiguration.
- **Updated:** Replaced deprecated...
I suggest this change because the Bitnami Kafka image is now behind a paywall and no longer publicly available, which prevents CI/CD and local environments from functioning properly. The Soldevelo...