A Python package that provides tensor computation (like NumPy) with strong GPU acceleration and deep neural networks built on a tape-based autograd system
Metron integrates a variety of open source big data technologies in order to offer a centralized tool for security monitoring and analysis
Spark is a fast and general cluster computing system for Big Data
Kafka™ is used for building real-time data pipelines and streaming apps
ClickHouse is a free analytic DBMS for big data.
Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation
A fast, distributed, high-performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms
Apache Cassandra is a highly-scalable partitioned row store. Rows are organized into tables with a required primary key
Deep Learning for humans
Distributed training framework for TensorFlow.
The Apache Ignite In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.
Alluxio (formerly Tachyon) is a virtual distributed storage system that operates at memory speed
Caffe: a fast open framework for deep learning.
A modular scientific software framework. It provides all the functionalities needed to deal with big data processing, statistical analysis, visualisation and storage. It is mainly written in C++ but integrated with other languages such as Python and R.
Stroom is a highly scalable data storage, processing and analysis platform.
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
Apache Solr is a search engine server that uses Apache Lucene
Apache Hadoop is a framework for running applications on large clusters built of commodity hardware
PlaidML is a framework for making deep learning work everywhere.
Microsoft Cognitive Toolkit (CNTK)
A high-performance neural network inference framework optimized for the mobile platform
Airflow is a platform to programmatically author, schedule and monitor workflows
The Apache Hive (TM) data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL
Arrow is a set of technologies that enable big-data systems to process and move data fast
Sqoop allows easy imports and exports of data sets between databases and HDFS
Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities
Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable
Lightweight library to build and train neural networks in Theano
Scalable PostgreSQL for multi-tenant and real-time workloads
A customisable 3D platform for agent-based AI research