Last Commit
May. 23, 2019
Sep. 26, 2018


Are you running Elasticsearch? Want to take your data and get the heck outta Dodge? Blaze provides everything you need in a neat, blazing fast package!




Blaze compared to other Elasticsearch dump tools. The index has ~3.5M rows and is ~5GB in size. Each tool is run under the time command, which measures how long it takes to write a simple JSON dump file.

Tool         Time
Blaze        00m40s
elasticdump  04m38s


Get the binary for your platform from the Releases page or compile it yourself. If you use it often it might make sense to put it in your PATH somewhere.

$ blaze --host=http://localhost:9200 --index=massive_1 > dump.ndjson

This will connect to Elasticsearch on the specified host and write the contents of the massive_1 index to stdout. Make sure to redirect the output somewhere, such as a file.

Output format

Blaze will dump everything to stdout in a format compatible with the Elasticsearch Bulk API, meaning you can use curl to put the data back.

curl -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/other_data/_bulk --data-binary "@dump.ndjson"
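To inspect a dump before loading it back, the two-line structure of the Bulk API format can be read pair by pair. The following is a minimal Python sketch, not part of Blaze; the function name iter_bulk_pairs and the sample lines are hypothetical, and the exact metadata fields Blaze writes on the action line are an assumption.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_bulk_pairs(lines: Iterable[str]) -> Iterator[Tuple[dict, dict]]:
    """Yield (action, source) pairs from Bulk API formatted lines."""
    pending = None  # holds the action line until its source line arrives
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        obj = json.loads(line)
        if pending is None:
            pending = obj
        else:
            yield pending, obj
            pending = None
    if pending is not None:
        raise ValueError("dump ends with an action line that has no source line")


# Illustrative sample only; real action lines may carry other metadata fields.
sample = [
    '{"index": {"_id": "1"}}',
    '{"title": "first"}',
    '{"index": {"_id": "2"}}',
    '{"title": "second"}',
]
pairs = list(iter_bulk_pairs(sample))
```

For a real dump, passing an open file object (for example open("dump.ndjson")) instead of the sample list should work the same way, since the function only iterates over lines.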

One issue when working with large datasets is that Elasticsearch limits the size of HTTP requests (at most 2GB, and lower by default depending on the http.max_content_length setting). The solution is to split the file with something like parallel. Each chunk must contain an even number of lines, since every document takes up two lines in the file: an action line followed by the source line.

cat dump.ndjson | parallel --pipe -l 50000 curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/other_data/_bulk --data-binary "@-"
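If parallel is not available, the same even-line splitting can be sketched in Python. This is a minimal illustration rather than anything Blaze ships; chunk_bulk_lines and docs_per_chunk are hypothetical names. Because each document takes two lines, the 50000 lines used above correspond to 25000 documents per chunk.

```python
from typing import Iterable, Iterator, List


def chunk_bulk_lines(lines: Iterable[str], docs_per_chunk: int) -> Iterator[List[str]]:
    """Group Bulk API lines into chunks that never split an action/source pair."""
    if docs_per_chunk < 1:
        raise ValueError("docs_per_chunk must be at least 1")
    lines_per_chunk = 2 * docs_per_chunk  # one action line plus one source line
    chunk: List[str] = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == lines_per_chunk:
            yield chunk
            chunk = []
    if chunk:
        if len(chunk) % 2:
            raise ValueError("input has an odd number of lines")
        yield chunk  # final, possibly shorter, chunk


# Five fake documents (ten lines), split into chunks of two documents each.
fake_dump = []
for i in range(5):
    fake_dump.append('{"index": {"_id": "%d"}}' % i)
    fake_dump.append('{"n": %d}' % i)
chunk_sizes = [len(c) for c in chunk_bulk_lines(fake_dump, 2)]
```

Each yielded chunk can then be joined with newlines (plus a trailing newline, which the Bulk API requires) and posted as its own _bulk request.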

Command line options

  • --host=<value> - the host where Elasticsearch is running.
  • --index=<value> - the index to dump.
  • --slices=<value> - (optional) the number of slices to split the scroll. Should be set to the number of shards for the index (as seen on /_cat/indices). Defaults to 5.
  • --size=<value> - (optional) the size of each response (i.e., the length of the hits array). Defaults to 5000.
  • --dump-mappings - specify this flag to dump the index mappings instead of the source.
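The advice to match --slices to the shard count can be automated. The sketch below assumes the output of /_cat/indices was requested with format=json, in which case each row reports the primary shard count as a string under the pri key. The helper names shard_count and blaze_args are hypothetical, and the sample JSON is illustrative rather than real output.

```python
import json
from typing import List


def shard_count(cat_indices_json: str, index: str) -> int:
    """Return the primary shard count for `index` from /_cat/indices?format=json output."""
    for row in json.loads(cat_indices_json):
        if row.get("index") == index:
            return int(row["pri"])  # the cat API reports numbers as strings
    raise KeyError(index)


def blaze_args(host: str, index: str, slices: int, size: int = 5000) -> List[str]:
    """Build an argument list using only the options documented above."""
    return [
        "blaze",
        f"--host={host}",
        f"--index={index}",
        f"--slices={slices}",
        f"--size={size}",
    ]


# Illustrative response body; a real one contains more fields per row.
sample_cat = '[{"index": "massive_1", "pri": "8", "rep": "1"}]'
args = blaze_args(
    "http://localhost:9200",
    "massive_1",
    shard_count(sample_cat, "massive_1"),
)
```

The resulting list could be handed to subprocess.run with stdout redirected to a file, mirroring the shell example earlier.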


To use HTTP Basic authentication, pass the following options. Note that a password typed on the command line ends up in your terminal history, so use these options with care.

  • --auth=basic - enable HTTP Basic authentication.
  • --basic-username=foo - the username.
  • --basic-password=bar - the password.
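One way to keep the password out of the terminal history is to launch Blaze from a small wrapper that reads it from the environment. The sketch below is an assumption-laden illustration: the variable name BLAZE_PASSWORD and the helper basic_auth_args are hypothetical and not something Blaze itself reads or provides. The password is still passed as a command-line argument, so it remains visible in the process list while Blaze runs.

```python
import os
from typing import List, Mapping


def basic_auth_args(
    username: str,
    env: Mapping[str, str] = os.environ,
    var: str = "BLAZE_PASSWORD",  # hypothetical variable name, chosen for this sketch
) -> List[str]:
    """Build the Basic authentication options, taking the password from `env`."""
    password = env.get(var)
    if not password:
        raise RuntimeError(f"set {var} before running")
    return [
        "--auth=basic",
        f"--basic-username={username}",
        f"--basic-password={password}",
    ]


# Example with an explicit mapping instead of the real environment.
auth = basic_auth_args("foo", env={"BLAZE_PASSWORD": "bar"})
```

These options would be appended to the rest of the Blaze argument list before starting the process.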

Building from source

Building Blaze is easy. It requires libcurl.

On Linux (and OSX)

$ git submodule update --init
$ make


Copyright © Viktor Elofsson and contributors.

Blaze is provided as-is under the MIT license. For more information see LICENSE.