Counting 2,412 Big Data & Machine Learning Frameworks, Toolsets, and Examples...
Suggestion? Feedback? Tweet @stkim1

elastalert

4142

Easy & Flexible Alerting With Elasticsearch

zipline

6502

Zipline, a Pythonic Algorithmic Trading Library

fastText

12890

Library for fast text representation and classification.

deeplearning4j

8357

Deep Learning for Java, Scala & Clojure on Hadoop & Spark With GPUs - From Skymind

face-recognition.js

444

Simple Node.js API for robust face detection and face recognition.

opencv4nodejs

942

Asynchronous OpenCV 3.x Binding for Node.js

luigi

8676

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.

warp-ctc

3138

Fast parallel CTC.

nifi

823

Apache NiFi is an easy-to-use, powerful, and reliable system to process and distribute data.

opensoc

421

OpenSOC Apache Hadoop Code

elasticsearch

28966

Open Source, Distributed, RESTful Search Engine

mlpack

1988

A scalable machine learning library, written in C++, that aims to provide fast, extensible implementations of cutting-edge machine learning algorithms

FastPhotoStyle

5089

An implementation of Nvidia's fast photorealistic style transfer algorithm. Given a content photo and a style photo, the code can transfer the style of the style photo to the content photo.

keras

25901

Deep Learning for humans

Age-Gender-Estimate-TF

150

Face age and gender estimation using TensorFlow

umap

827

Uniform Manifold Approximation and Projection

tensorboard-pytorch

1453

TensorBoard for PyTorch

NeuralKart

562

A Real-time Mario Kart AI using CNNs, Offline Search, and DAGGER

hue

2727

Let’s Big Data. Hue is an open source Web interface for analyzing data with Hadoop and Spark.

xgboost

10860

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Flink and DataFlow

Tensorflow-Project-Template

1511

A best-practice project template architecture for TensorFlow.

ann-benchmarks

381

Benchmarks of approximate nearest neighbor libraries in Python

featuretools

777

A Python library for automated feature engineering

pretrained.ml

338

Compilation of pre-trained deep learning models with demos and code.

DeepSpeech

5856

A TensorFlow implementation of Baidu's DeepSpeech architecture

incubator-predictionio

11084

PredictionIO, a machine learning server for developers and ML engineers. Built on Apache Spark, HBase and Spray.

prisma

6176

Prisma turns your database into a realtime GraphQL API

spark

16259

Spark is a fast and general cluster computing system for Big Data

images-to-osm

265

Use TensorFlow, Bing, and OSM to find features in satellite images for fun.

hadoop

5851

Apache Hadoop is a framework for running applications on large clusters built of commodity hardware