A fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms
Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems
A Python package that provides Tensor computation (like NumPy) with strong GPU acceleration and Deep Neural Networks built on a tape-based autograd system
Universe: a software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.
ClickHouse is a free analytic DBMS for big data.
Federated Big Data Orchestration Service
High performance distributed data processing engine
Deep Learning for humans
Caffe: a fast open framework for deep learning.
Spark is a fast and general cluster computing system for Big Data
Open Source, Distributed, RESTful Search Engine
Let’s Big Data. Hue is an open source Web interface for analyzing data with Hadoop and Spark.
A generic Dynamo implementation for different k-v storage engines
Airflow is a platform to programmatically author, schedule and monitor workflows
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
A Map/Reduce framework for distributed computing
Apache Hadoop is a framework for running applications on large clusters built of commodity hardware
Kafka™ is used for building real-time data pipelines and streaming apps
PArallel Distributed Deep LEarning
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable
Vitess is a database clustering system for horizontal scaling of MySQL.
Microsoft Cognitive Toolkit (CNTK)
Apache Solr is a search engine server that uses Apache Lucene
The Koç University deep learning framework implemented in Julia by Deniz Yuret and collaborators. It supports GPU operation and automatic differentiation using dynamic computational graphs for models defined in plain Julia.
Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks
Distributed SQL query engine for big data
Apache Kylin is an open source Distributed Analytics Engine, contributed by eBay Inc., that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets
PlaidML is a framework for making deep learning work everywhere.