Counting 3,039 Big Data & Machine Learning Frameworks, Toolsets, and Examples...

autokeras

2536

The ultimate goal of AutoML is to make deep learning models easily accessible to domain experts with limited data science or machine learning background.

scikit-learn

30005

scikit-learn: machine learning in Python

elasticsearch-hadoop

1308

Elasticsearch real-time search and analytics natively integrated with Hadoop

zhusuan

1336

A library for Bayesian deep learning and generative models, based on TensorFlow

sympy

5038

A computer algebra system written in pure Python

fastText

15232

Library for fast text representation and classification.

incubator-singa

1474

Distributed deep learning system

DLTK

651

Deep Learning Toolkit for Medical Image Analysis

distributed

615

Distributed computation in Python

deeplearning4j

9490

Deep Learning for Java, Scala & Clojure on Hadoop & Spark With GPUs - From Skymind

elastic

2827

Elasticsearch client for Go.

sarama

3289

Sarama is a Go library for Apache Kafka 0.8, 0.9, and 0.10.

scipy

4809

SciPy is open-source software for mathematics, science, and engineering

tensor2tensor

4747

A library for generalized sequence to sequence models

auto-sklearn

2461

auto-sklearn is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator

pretrained-models.pytorch

1840

Pretrained ConvNets for PyTorch: ResNeXt101, ResNet152, InceptionV4, InceptionResnetV2, etc.

brain.js

5800

Neural networks in JavaScript

deepdetect

1635

Deep Learning API and Server in C++11 with Python bindings and support for Caffe, TensorFlow and XGBoost

librdkafka

2159

The Apache Kafka C/C++ library

datasketch

610

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++

text_classification

2553

All kinds of text classification models and more with deep learning

zipline

7492

Zipline, a Pythonic Algorithmic Trading Library

hyperopt

2417

Distributed Asynchronous Hyperparameter Optimization in Python

h2o-3

3321

Open Source Fast Scalable Machine Learning API For Smarter Applications (Deep Learning, Gradient Boosting, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA...)

allennlp

2973

An open-source NLP research library, built on PyTorch.

sparser

159

Sparser: Raw Filtering for Faster Analytics over Raw Data

xlearn

1809

High Performance, Easy-to-use, and Scalable Machine Learning Package

tensorforce

1813

A TensorFlow library for applied reinforcement learning

catboost

3103

CatBoost is an open-source library for gradient boosting on decision trees, with categorical feature support out of the box, for Python and R

xgboost

13160

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Flink and DataFlow