bootstrapped - confidence intervals made easy
bootstrapped is a Python library that allows you to build confidence intervals from data. This is useful in a variety of contexts - including during ad-hoc a/b test analysis.
bootstrapped - Benefits
- Efficient computation of percentile-based confidence intervals
- Functions to handle single populations and a/b test scenarios
- Functions to understand statistical power
    import numpy as np
    import bootstrapped.bootstrap as bs
    import bootstrapped.stats_functions as bs_stats

    mean = 100
    stdev = 10

    population = np.random.normal(loc=mean, scale=stdev, size=50000)

    # take 1k 'samples' from the larger population
    samples = population[:1000]

    print(bs.bootstrap(samples, stat_func=bs_stats.mean))
    # >> 100.08 (99.46, 100.69)

    print(bs.bootstrap(samples, stat_func=bs_stats.std))
    # >> 9.49 (9.92, 10.36)
bootstrapped requires numpy and pandas. The power analysis plotting function requires matplotlib. statsmodels is used in some of the examples.
    # clone bootstrapped
    cd bootstrapped
    pip install -r requirements.txt
    python setup.py install
How bootstrapped works
tl;dr - Percentile-based confidence intervals computed by bootstrap re-sampling with replacement.
Bootstrapped generates confidence intervals given input data by:
- Generating a large number of re-samples from the input, each drawn with replacement (re-sampling)
- Calculating the mean (or whatever statistic you care about) for each re-sample
- Calculating the 2.5th and 97.5th percentiles of these results (default range)
- Using these two percentiles as the bounds of the 95% confidence interval
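The steps above can be sketched with plain numpy. This is a minimal illustration of a percentile bootstrap, not the library's actual implementation: the function name `percentile_bootstrap_ci` and its parameters (`alpha`, `num_iterations`, `seed`) are hypothetical and chosen only for this example.

```python
import numpy as np


def percentile_bootstrap_ci(values, stat_func=np.mean, alpha=0.05,
                            num_iterations=10000, seed=None):
    """Hypothetical helper: return (estimate, lower, upper) for a
    percentile bootstrap confidence interval at level 1 - alpha."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values)
    n = len(values)

    stats = np.empty(num_iterations)
    for i in range(num_iterations):
        # Step 1: draw a re-sample of the same size, with replacement.
        resample = rng.choice(values, size=n, replace=True)
        # Step 2: compute the statistic of interest on the re-sample.
        stats[i] = stat_func(resample)

    # Steps 3-4: the alpha/2 and 1 - alpha/2 percentiles of the re-sampled
    # statistics form the interval (2.5th and 97.5th for alpha=0.05).
    lower, upper = np.percentile(
        stats, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return stat_func(values), lower, upper


# Usage: 1k draws from a normal distribution with mean 100 and stdev 10.
data_rng = np.random.default_rng(0)
samples = data_rng.normal(loc=100, scale=10, size=1000)

estimate, lower, upper = percentile_bootstrap_ci(samples, seed=1)
print(f"{estimate:.2f} ({lower:.2f}, {upper:.2f})")
```

Passing a different `stat_func` (for example `np.std` or `np.median`) gives an interval for that statistic instead; the explicit loop keeps memory use low at the cost of some speed.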
For more information please see:
- Bootstrap confidence intervals (good intro)
- An introduction to Bootstrap Methods
- When the bootstrap doesn't work
- (book) An Introduction to the Bootstrap
- (book) Bootstrap Methods and their Application
See the CONTRIBUTING file for how to help out.
Spencer Beecher, Don van der Drift, David Martin, Lindsay Vass, Sergey Goder, Benedict Lim, and Matt Langner.
Special thanks to Eytan Bakshy.
bootstrapped is BSD-licensed. We also provide an additional patent grant.