
Election Transparency

Slack: #election-transparency

Project Leads: @chris_dick, @rachelanddata, @scottcame

Maintainers (people with write/commit access):

  • GitHub: @scottcame, @rachelanddata, @eric_bickel, @chris_dick
  • @scottcame, @sharon, @chris_dick

Data: Head over to to check out our most up-to-date data!

Project Description: This project analyzes elections in an effort to identify trends, outliers, and/or anomalies to enable insight and transparency into the democratic voting process.

There are several related veins of work going on in the group at the current time:

  • Data Collection, Cleaning, and Joining
  • Data Visualization
  • Modelling and Outlier Detection

For more on the objectives of the group, take a look at our objectives statement.

Getting Started

Want to Contribute?

  • "First-timers" are welcome! Whether you're trying to learn data science, hone your coding skills, or get started collaborating over the web, we're happy to help. If you have any questions, feel free to ask them on our Slack channel or reach out to one of the project leads. If you have questions about Git and GitHub specifically, our github-playground repo and the #github-help Slack channel are good places to start.
  • Feeling Comfortable with GitHub, and Ready to Dig In? Check out our GitHub issues. This is our official listing of the work that we are planning to get done. As we add more issues, the maintainers will make sure to specifically tag those issues that are good for beginners with: beginner-friendly
  • Code Reviews: All commits to this repository are reviewed by either a project lead or a maintainer. The goal of these reviews is twofold: we want to make sure we have a high-quality product, and we want to make sure that we all learn from one another. Reviews give both the submitter and the reviewer a chance to do just that.
  • This README is a Living Document: If you see something you think should be changed, feel free to edit it and submit a Pull Request. Not only will this be a huge help to the group, it is also a great first PR!
  • Got an Idea for Something We Should be Working On? You can submit an issue on our GitHub page, mention your idea on the Slack channel, or reach out to one of the project leads.

Want to start exploring the data?

Right now, we have two datasets. One contains voter registration data, including party affiliation in states that allow it, as of the 2016 general election in the United States. (We plan to add historical data as well, which is a great opportunity to contribute if you're interested!) The second dataset contains results from the 2016 general election for President. Both datasets are at the county level.

You can access the datasets in a couple of different ways. They are on at, as comma-separated value (csv) files. Or, you can install the R package, and access the datasets as R data frames. See for more about the R package and data frames.
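
Because both datasets are at the county level, a natural first step is to join them on a county identifier. The following is a minimal sketch in Python using only the standard library. The column names (county_fips, registered_dem, votes_dem, and so on) and the inline sample rows are hypothetical placeholders, not the project's actual schema or data; swap in the real CSV files and column names once you have downloaded them.

```python
import csv
import io

# Hypothetical placeholder data: the column names and values below are made up
# for illustration only and are NOT taken from the project's files.
REGISTRATION_CSV = """county_fips,registered_dem,registered_rep
00001,100,200
00002,300,500
00003,50,60
"""

RESULTS_CSV = """county_fips,votes_dem,votes_rep
00001,80,150
00002,250,400
"""


def read_by_county(text, key="county_fips"):
    """Parse CSV text into a dict keyed by the county identifier column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def join_counties(registration, results):
    """Inner-join two county-keyed dicts, keeping counties present in both."""
    joined = {}
    for fips in registration.keys() & results.keys():
        merged = dict(registration[fips])
        merged.update(results[fips])
        joined[fips] = merged
    return joined


registration = read_by_county(REGISTRATION_CSV)
results = read_by_county(RESULTS_CSV)
combined = join_counties(registration, results)

# csv.DictReader yields strings, so convert to int before doing arithmetic.
for fips in sorted(combined):
    row = combined[fips]
    print(fips, int(row["registered_dem"]), int(row["votes_dem"]))
```

To read from a downloaded file instead of an inline string, open it with open(path, newline="") and pass the file object to csv.DictReader. Keeping the county identifier as a string preserves any leading zeros.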


The following is a non-exhaustive list of the skills that are useful for this project:

  • R: The code that transforms raw voter registration and election results data is in an R package. So if you're interested in enhancing that, or working with source data, R skills are beneficial.
  • Python
  • Data Extraction
  • Data Cleaning
  • Data Analysis and Modelling

Special thanks to the drug-spending team for writing such a great README that we had to borrow liberally from it.