Counting 3,834 Big Data & Machine Learning Frameworks, Toolsets, and Examples...

elasticsearch

41140

Open Source, Distributed, RESTful Search Engine

dynomite

3026

A generic dynamo implementation for different k-v storage engines

tensorflow

127844

An Open Source Machine Learning Framework for Everyone

plaidml

2106

PlaidML is a framework for making deep learning work everywhere.

grakn

1569

A Hyper-Relational Database for Knowledge-Oriented Systems

horovod

6413

Distributed training framework for TensorFlow, Keras, PyTorch, and MXNet.

PyTorch

28171

A Python package that provides Tensor computation (like NumPy) with strong GPU acceleration and Deep Neural Networks built on a tape-based autograd system

flair

6056

A very simple framework for state-of-the-art NLP

yugabyte-db

1104

A transactional, high-performance database for building distributed cloud services. It supports Cassandra-compatible and Redis-compatible APIs, with PostgreSQL in Beta.

incubator-airflow

12273

Airflow is a platform to programmatically author, schedule and monitor workflows

keras

41268

Deep Learning for humans

CNTK

16135

Microsoft Cognitive Toolkit (CNTK)

incubator-impala

287

Lightning-fast, distributed SQL queries for petabytes of data stored in Apache Hadoop clusters

ClickHouse

6933

ClickHouse is a free analytic DBMS for big data.

druid

8081

Column-oriented distributed data store ideal for powering interactive applications

spark

21866

Spark is a fast and general cluster computing system for Big Data

MNN

1714

MNN is a lightweight deep neural network inference engine.

caffe

28109

Caffe: a fast open framework for deep learning.

LightGBM

8641

A fast, distributed, high-performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms

ncnn

6358

A high-performance neural network inference framework optimized for the mobile platform

mace

3253

MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

vitess

7984

Vitess is a database clustering system for horizontal scaling of MySQL.

chainer

4790

A flexible framework of neural networks for deep learning

lab

5793

A customisable 3D platform for agent-based AI research

kafka

12138

Kafka™ is used for building real-time data pipelines and streaming apps

lucene-solr

2561

Apache Solr is a search engine server that uses Apache Lucene

zookeeper

6326

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services

storm

5672

Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation