An Open Source Machine Learning Framework for Everyone
Containerized Data Analytics
Schemaless Stream Processing (Complex Event Processing) Server with SQL
A large-scale entity and relation database supporting very large graphs containing rich, aggregated properties on the nodes and edges. Several storage options are available, including Accumulo, HBase and Parquet.
Open Source, Distributed, RESTful Search Engine
Deep Learning for humans
Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation
TiDB is a distributed HTAP database compatible with the MySQL protocol
A Hadoop-native SQL query engine that combines the key technological advantages of an MPP database with the scalability and convenience of Hadoop
A realtime distributed OLAP datastore
Spark is a fast and general cluster computing system for Big Data
Apache Geode is a data management platform that provides real-time, consistent access to data-intensive applications throughout widely distributed cloud architectures
A platform for cluster management and resource scheduling for AI that incorporates a mature design with a proven track record in Microsoft's large-scale production environment
Vitess is a database clustering system for horizontal scaling of MySQL.
An engine for low-latency computation over large data sets. It stores and indexes your data such that queries, selection and processing over the data can be performed at serving time.
Distributed, masterless, high performance, fault tolerant data processing
Caffe: a fast open framework for deep learning.
Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.
A Python package that provides Tensor computation (like NumPy) with strong GPU acceleration and Deep Neural Networks built on a tape-based autograd system
Airflow is a platform to programmatically author, schedule and monitor workflows
A modular scientific software framework. It provides all the functionalities needed to deal with big data processing, statistical analysis, visualisation and storage. It is mainly written in C++ but integrated with other languages such as Python and R.
A high-performance neural network inference framework optimized for the mobile platform
Distributed training framework for TensorFlow.
Distributed training framework for TensorFlow, Keras, PyTorch, and MXNet.
Apache Calcite is a dynamic data management framework.
A framework for training and evaluating AI models on a variety of openly available dialog datasets.
PArallel Distributed Deep LEarning
Beringei is a high performance, in-memory storage engine for time series data.