data-processing topic
torcharrow
High-performance model preprocessing library on PyTorch
scramjet
Public tracker for Scramjet Cloud Platform, a platform that brings data from many environments together.
forte
Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/
Machine-Learning-for-Solar-Energy-Prediction
Predict the power production of a solar panel farm from weather measurements using machine learning
blaze
A blazing fast exporter for your Elasticsearch data.
Processor
Ontology-driven Linked Data processor and server for SPARQL backends. Apache License.
data_processing_course
Some class materials for a data processing course using PySpark
pulsar-spark
Spark Connector to read and write with Pulsar
incubator-wayang
Apache Wayang (incubating) is the first cross-platform data processing system.
blinkist-m4a-downloader
Grabs all of the audio files from all of the Blinkist books