The Race to the Élysée Palace - Analysis of the 2017 French Presidential elections
In this project, we study the socio-demographic factors that influenced voters during the 2017 French presidential election. The race showcased a high level of political and ideological polarization of public opinion between two diverging views: that of Marine Le Pen, a far-right, anti-immigrant, anti-European Union candidate, and that of Emmanuel Macron, an upstart former banker who had never held elected office. This study examines the primary vote for Emmanuel Macron and Marine Le Pen in the 2017 French presidential election.
In this study, we collected, cleaned, and aggregated demographic and economic data to predict the winning candidate in the French presidential election for each town in continental France.
In France, the major political parties are (1) the National Front (Front national, FN), led by Le Pen; (2) the Socialist Party (Parti socialiste, PS), led by Hamon / Hollande; (3) The Republicans (Les Républicains, LR), led by Fillon / Sarkozy; and (4) the Left Front (Front de gauche, FDG), led by Mélenchon. In the last presidential election, the race came down to two candidates: Marine Le Pen of the far-right National Front and the centrist former economy minister Emmanuel Macron. Ms. Le Pen planned to stop immigration and leave the EU, and we found that she was supported in regions with high unemployment and low incomes. Mr. Macron won in big cities and in diverse, economically stable regions, where most immigrants and educated people supported his plans.
To understand how France voted and why, we used demographic and economic data such as population, unemployment, the average age in each area, the ratio of retired people, the ratio of students, gender ratios, people's professions in each region, the ratio of immigrants and foreigners, and the ratio of educated people and their academic level.
We use logistic regression with L2 regularization and evaluate the model with 5-fold cross-validation to guard against over-fitting. The best-performing model has an overall precision of 0.72, recall of 0.75, F1 score of 0.70, and accuracy of 0.75. To understand the importance of the predictors, we use bagged decision-tree ensembles such as Random Forest and Extra Trees to estimate feature importance. The table below lists the importance scores of the top attributes.
| Attribute | Importance score |
|---|---|
| Ratio (%) of people with higher education | 0.0429 |
| Ratio (%) of immigrants | 0.0312 |
| Ratio (%) of foreign-born women | 0.0294 |
| Ratio (%) of foreign-born men | 0.0282 |
| Ratio (%) of middle-class workers | 0.0265 |
| Ratio (%) of foreign-born over 55 | 0.0262 |
| Ratio (%) of foreign-born women under 15 | 0.0261 |
| Ratio (%) of foreign-born over 55 | 0.0254 |
| Ratio (%) of unemployed women between 15 & 64 | 0.0246 |
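
The modelling steps described above can be sketched with scikit-learn. This is a minimal illustration and not the project's actual code. It rests on several assumptions: the data is a synthetic stand-in generated with `make_classification`, the metrics use weighted averaging (the averaging behind the reported scores is not stated), and the hyperparameters (`C=1.0`, 200 trees) are arbitrary. Note also that scikit-learn's `ExtraTreesClassifier` does not bootstrap samples by default.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the town-level feature matrix (NOT the project's data).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, random_state=0
)

# Logistic regression; scikit-learn's default penalty is L2.
# Scaling the features first helps the solver converge.
clf = make_pipeline(StandardScaler(), LogisticRegression(C=1.0, max_iter=1000))

# 5-fold cross-validation with the four metrics mentioned in the text.
# Weighted averaging is an assumption made for this sketch.
scoring = ["precision_weighted", "recall_weighted", "f1_weighted", "accuracy"]
cv_results = cross_validate(clf, X, y, cv=5, scoring=scoring)
mean_scores = {name: cv_results[f"test_{name}"].mean() for name in scoring}

# Impurity-based feature importances from an ensemble of randomized trees.
forest = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = forest.feature_importances_
top = np.argsort(importances)[::-1][:5]  # indices of the 5 highest scores

for name, value in mean_scores.items():
    print(f"{name}: {value:.2f}")
for i in top:
    print(f"feature_{i}: {importances[i]:.4f}")
```

Impurity-based importances sum to 1 across all features, so with many correlated predictors each individual score tends to be small, which is in line with the magnitudes in the table.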
Our findings indicate that socio-demographic factors and economic rationality were influential in people's voting behaviour during the last French presidential election.
- French Election results
- The French Open Data Portal
- The National Institute of Statistics and Economic Studies (INSEE)
- Open Street Map
You can find all the data along with a description in this file
How to run the code
- Set up a virtual environment
virtualenv env
source env/bin/activate
- Install the required modules listed in requirements.txt
pip install -r requirements.txt
- Download socio-demographic data to perform the analysis
python main.py -Run DATA_COLLECTION_TASK
- Clean, filter and join the different datasets
python main.py -Run DATA_PROCESSING_TASK
- Classification task
python main.py -Run DATA_ANALYSIS_TASK
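
For readers curious what the clean, filter and join step might involve, here is a rough pandas sketch. It is not taken from `main.py`: the table layouts, the column names (`town_code`, `macron_votes`, `le_pen_votes`, `population`, `unemployment_rate`), and every value are hypothetical and made up for illustration.

```python
import pandas as pd

# Hypothetical town-level tables; names and values are illustrative only.
# Town codes are kept as strings so that leading zeros survive.
results = pd.DataFrame({
    "town_code": ["01001", "01002", "01004"],
    "macron_votes": [120, 40, 900],
    "le_pen_votes": [90, 75, 600],
})
demographics = pd.DataFrame({
    "town_code": ["01001", "01002", "01005"],
    "population": [780.0, 250.0, None],
    "unemployment_rate": [8.5, 12.1, 9.0],
})

# Clean: drop towns with missing demographic values.
demographics = demographics.dropna()

# Join: keep only towns present in both tables, and fail loudly if a
# town code is duplicated on either side.
towns = results.merge(
    demographics, on="town_code", how="inner", validate="one_to_one"
)

# Label each town with the candidate who received more votes.
towns["winner"] = (towns["macron_votes"] > towns["le_pen_votes"]).map(
    {True: "Macron", False: "Le Pen"}
)

print(towns[["town_code", "winner"]])
```

An inner join silently drops towns that are missing from either source, so it is worth comparing row counts before and after the merge.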
If you are having issues, please let us know or submit a pull request.
The project is licensed under the MIT License.