Apache Cassandra is a highly scalable partitioned row store. Rows are organized into tables with a required primary key.
Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
Kafka Monitor is a framework to implement and execute long-running Kafka system tests in a real cluster
An engine for low-latency computation over large data sets. It stores and indexes your data such that queries, selection and processing over the data can be performed at serving time.
Infinispan is an open source data grid platform and highly scalable NoSQL cloud data store.
Apache Geode is a data management platform that provides real-time, consistent access to data-intensive applications throughout widely distributed cloud architectures
Universe: a software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.
A distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.
Apache Edgent is an open source stream processing programming model and lightweight micro-kernel style runtime for edge devices that enables you to analyze data and events at the device
An open source, distributed bitmap index that dramatically accelerates queries across multiple, massive data sets.
A modular scientific software framework. It provides all the functionalities needed to deal with big data processing, statistical analysis, visualisation and storage. It is mainly written in C++ but integrated with other languages such as Python and R.
DUCC is a cluster management system providing tooling, management, and scheduling facilities to automate the scale-out of applications written to the UIMA framework
A scalable, fault-tolerant, low-latency storage service optimized for append-only workloads.
The Apache Ignite In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.
Pig is a dataflow programming environment for processing very large files
GridGain’s In-Memory Data Fabric is designed to deliver uncompromised performance for the widest set of in-memory computing use cases
Apache Calcite is a dynamic data management framework.
Lightweight library to build and train neural networks in Theano
A realtime distributed OLAP datastore
Curator is a set of Java libraries that make using Apache ZooKeeper much easier
Federated Big Data Orchestration Service
Parquet-MR contains the Java implementation of the Parquet format
Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems
Lightweight real-time big data streaming engine over Akka
Universal data ingestion framework for Hadoop.
Oozie is an extensible, scalable and reliable system to define, manage, schedule, and execute complex Hadoop workloads via web services
Alluxio (formerly Tachyon): virtual distributed storage at memory speed
Ranger is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform
Beringei is a high performance, in-memory storage engine for time series data.
SnappyData: OLTP + OLAP Database built on Apache Spark