The big data serving engine - Store, search, rank and organize big data at user serving time. Vespa is an engine for low-latency computation over large data sets. It stores and indexes your data and executes distributed queries including evaluation of machine-learned models over many data points in real time.
Use cases such as search, recommendation and personalization need to select a subset of data in a large corpus, evaluate machine-learned models over the selected data, organize and aggregate it and return it, typically in less than 100 milliseconds, all while the data corpus is continuously changing.
This is hard to do, especially with large corpora that need to be distributed over multiple nodes and evaluated in parallel. Vespa is a platform that performs these operations for you. It has been in development for many years and is used on a number of large internet services and apps which serve hundreds of thousands of queries from Vespa per second.
To get started using Vespa, pick one of the quick start documents:
- Run on a Mac or Linux machine using Docker
- Run on a Windows machine using Docker
- Run on a Mac or Linux machine using VirtualBox+Vagrant
- Multinode install on AWS EC2
- Multinode install on AWS ECS
- The application created in the quickstart is fully functional and production ready, but you may want to add more nodes for redundancy.
- Try the Blog search and recommendation tutorial to learn more about using Vespa
- See Developing applications for how to add your own Java components to your Vespa application
- See Vespa APIs to understand how to interface with Vespa
- Explore the sample applications
Full documentation is available on https://docs.vespa.ai.
We welcome contributions! See CONTRIBUTING.md to learn how to contribute.
If you want to contribute to the documentation, see https://github.com/vespa-engine/documentation
You do not need to build Vespa to use it, but if you want to contribute you need to be able to build the code. This section explains how to build and test Vespa. To understand where to make changes, see Code-map.md. Some suggested improvements with pointers to code are in TODO.md.
Set up the build environment
C++ and Java building is supported on CentOS 7. The Java source can also be built on any platform having Java 11 and Maven installed. We recommend using the following environment: Create C++ / Java dev environment on CentOS using VirtualBox and Vagrant. You can also set up CentOS 7 natively and install the following build dependencies:
    sudo yum-config-manager --add-repo https://copr.fedorainfracloud.org/coprs/g/vespa/vespa/repo/epel-7/group_vespa-vespa-epel-7.repo
    sudo yum -y install epel-release centos-release-scl yum-utils
    sudo yum -y install ccache \
        rpm-build
    yum-builddep -y <vespa-source>/dist/vespa.spec
Build Java modules
    export MAVEN_OPTS="-Xms128m -Xmx1024m"
    source /opt/rh/rh-maven35/enable
    bash bootstrap.sh java
    mvn -T <num-threads> install
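Because the Java modules can be built on any platform with Java 11 and Maven, a quick pre-flight check can catch a missing requirement before a long build starts. The following is a minimal sketch and not part of the Vespa tooling: the function names `java_major` and `check_java_build_env` are hypothetical, and treating any major version other than 11 as an error is an assumption based on the requirement stated above.

```shell
# Hypothetical helper: read `java -version` style output on stdin and print
# the leading number of the first version string found.
# Note: old-style versions such as "1.8.0_201" yield 1, not 8.
java_major() {
    sed -n 's/.*version "\([0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical pre-flight check: returns non-zero with a message on stderr
# if Maven or Java is missing, or if the Java major version is not 11.
check_java_build_env() {
    command -v mvn >/dev/null 2>&1 || { echo "mvn not found on PATH" >&2; return 1; }
    command -v java >/dev/null 2>&1 || { echo "java not found on PATH" >&2; return 1; }
    # `java -version` writes to stderr, so redirect it into the pipe.
    major="$(java -version 2>&1 | java_major)"
    [ "${major:-0}" -eq 11 ] || { echo "expected Java 11, found: ${major:-unknown}" >&2; return 1; }
}
```

Calling `check_java_build_env` before the build commands above gives an early, readable failure instead of a Maven error partway through the build.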
Build C++ modules
In the commands below, replace <build-dir> with the name of the directory in which you'd like to build Vespa,
and <source-dir> with the directory in which you've cloned/unpacked the source tree.
    bash bootstrap-cpp.sh <source-dir> <build-dir>
    cd <build-dir>
    make -j <num-threads>
    ctest3 -j <num-threads>
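As an illustration of filling in the placeholders, the sketch below wraps the same commands in a function. The paths `$HOME/vespa` and `$HOME/vespa-build`, the function name `build_cpp`, and the choice of one job per core via `nproc` are assumptions made for this example, not project recommendations.

```shell
# Hypothetical locations; adjust to your own checkout and build directory.
SOURCE_DIR="${SOURCE_DIR:-$HOME/vespa}"
BUILD_DIR="${BUILD_DIR:-$HOME/vespa-build}"
# Assumption: one job per core; falls back to 1 where nproc is unavailable.
NUM_THREADS="${NUM_THREADS:-$(nproc 2>/dev/null || echo 1)}"

# The body runs in a subshell (note the parentheses), so the cd does not
# change the caller's working directory. Each step runs only if the
# previous one succeeded.
build_cpp() (
    bash bootstrap-cpp.sh "$SOURCE_DIR" "$BUILD_DIR" &&
    cd "$BUILD_DIR" &&
    make -j "$NUM_THREADS" &&
    ctest3 -j "$NUM_THREADS"
)
```

Run `build_cpp` from the directory where you would otherwise type the commands above. Setting `NUM_THREADS` in the environment beforehand overrides the per-core default, which can help on machines where a fully parallel C++ build runs out of memory.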
Create RPM packages
sh dist.sh VERSION && rpmbuild -ba ~/rpmbuild/SPECS/vespa-VERSION.spec
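VERSION appears twice in the command above: once as the argument to dist.sh and once in the spec file name. A minimal sketch that keeps the two in sync with a shell variable follows; the version number `7.1.0` and the function name `build_rpms` are made up for illustration.

```shell
# Hypothetical version string; use the version you actually intend to package.
VERSION="7.1.0"
# Spec file path as used in the rpmbuild command above.
SPEC="$HOME/rpmbuild/SPECS/vespa-$VERSION.spec"

# Run dist.sh for the chosen version, then build the RPMs only if it succeeded.
build_rpms() {
    sh dist.sh "$VERSION" && rpmbuild -ba "$SPEC"
}
```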
Code licensed under the Apache 2.0 license. See LICENSE for terms.