Counting 2,409 Big Data & Machine Learning Frameworks, Toolsets, and Examples...

prisma

6116

Prisma turns your database into a realtime GraphQL API

Mobius

715

C# and F# language binding and extensions to Apache Spark

luigi

8654

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, and more. It also comes with Hadoop support built in.

TensorComprehensions

752

A domain specific language to express machine learning workloads.

opencv4nodejs

926

Asynchronous OpenCV 3.x bindings for Node.js

edward

3307

A library for probabilistic modeling, inference, and criticism. Deep generative models, variational inference. Runs on TensorFlow.

thrift

4434

The Apache Thrift software framework, for scalable cross-language services development, combines a software stack with a code generation engine to build services that work efficiently and seamlessly between multiple languages

vitess

5572

Vitess is a database clustering system for horizontal scaling of MySQL.

faceswap

3778

Unofficial project based on the original /r/Deepfakes thread

moviebox

144

Machine learning movie recommender

lorca

21

Natural Language Processing for Spanish in JavaScript. Stemmer, sentiment analysis, readability, tf-idf with batteries, concordance and more!

xgboost

10836

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Flink and DataFlow

nbgrader

442

A system for assigning and grading notebooks

NPMT

95

Towards Neural Phrase-based Machine Translation

Tensorflow-Project-Template

1474

A best-practice project template architecture for TensorFlow.

DeepJ

352

A model for style-specific music generation

elasticsearch

28875

Open Source, Distributed, RESTful Search Engine

dash

4083

Interactive, Reactive Web Apps for Python. Dash Is Productive™

dask

2459

Versatile parallel programming with task scheduling

pyro

2893

Deep universal probabilistic programming with Python and PyTorch

go-deep

158

Feedforward/backpropagation neural network implementation

keras

25751

Deep Learning for humans

pytorch-cnn-finetune

109

Fine-tune pretrained Convolutional Neural Networks with PyTorch

dl4j-examples

799

Deeplearning4j Examples (DL4J, DL4J Spark, DataVec)

caffe

22870

Caffe: a fast open framework for deep learning.

spark-nlp

252

Natural Language Understanding library for Apache Spark with simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment.

incubator-predictionio

11073

PredictionIO, a machine learning server for developers and ML engineers. Built on Apache Spark, HBase and Spray.

Theano

7876

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.

billboard.js

2368

Reusable JavaScript chart library with an easy interface, based on D3 v4+

DeepSpeech

5804

A TensorFlow implementation of Baidu's DeepSpeech architecture