data-processing topic

List data-processing repositories

torcharrow

634
Stars
78
Forks

High-performance model preprocessing library on PyTorch

scramjet

254
Stars
20
Forks

Public tracker for Scramjet Cloud Platform, a platform that brings data from many environments together.

forte

236
Stars
60
Forks

Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/

Predict the Power Production of a solar panel farm from Weather Measurements using Machine Learning

blaze

60
Stars
8
Forks

A blazing fast exporter for your Elasticsearch data.

Processor

58
Stars
6
Forks

Ontology-driven Linked Data processor and server for SPARQL backends. Apache License.

data_processing_course

52
Stars
27
Forks

Some class materials for a data processing course using PySpark

pulsar-spark

110
Stars
49
Forks

Spark Connector to read and write with Pulsar

incubator-wayang

172
Stars
70
Forks

Apache Wayang (incubating) is the first cross-platform data processing system.

blinkist-m4a-downloader

136
Stars
26
Forks
136
Watchers

Grabs all of the audio files from all of the Blinkist books