The analysis of the 2014 and 2019 Lok Sabha elections by AtliQ Media aimed to present unbiased insights on voter turnout and other themes. Key findings included variations in voter turnout ratios, consistency in party performance, and shifts in vote shares. The analysis also explored potential correlations between voter turnout and factors like postal votes, GDP, and literacy rates. Recommendations to enhance electoral participation included improving voting accessibility, increasing voter education, and removing barriers to voting.

Presentation:

https://youtu.be/o_vNawsj5SI

Problem Statement:

AtliQ Media is a private media company that wants to telecast a show on the 2024 Lok Sabha elections in India. Unlike other channels, it does not want to debate who is going to win the election. Instead, it wants to present unbiased insights from the 2014 and 2019 elections and discuss less explored themes such as voter turnout percentage in India. Peter, a data analyst at the company, has been handed the task of generating meaningful insights from the data. Since this is a sensitive topic, he sought help from his manager, Tony Sharma, who provided a list of primary and secondary questions.

Datasets:

Data Cleaning

  1. The data may contain spelling mismatches in constituency names, and some distinct constituencies may share an identical name. Validate names before joining or aggregating.
  2. Andhra Pradesh was bifurcated in 2014. For simplicity, constituencies that became part of Telangana (such as Adilabad, Hyderabad, Warangal, etc.) should be attributed to Telangana rather than Andhra Pradesh in the 2014 data as well.
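
The two cleaning rules above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the column names `state` and `pc_name`, the helper names, and the `TELANGANA_PCS` set are assumptions, and the set lists only the three constituencies named above, so it would need to be completed before real use.

```python
import pandas as pd

# Assumption: only the three constituencies named in the text are listed here;
# the full Telangana list has to be filled in for real data.
TELANGANA_PCS = {"adilabad", "hyderabad", "warangal"}


def clean_constituencies(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Normalize names and apply the 2014 Telangana rule.

    Assumes hypothetical columns `state` and `pc_name`.
    """
    out = df.copy()
    # Trim whitespace and unify casing so spelling variants line up.
    out["pc_name"] = out["pc_name"].str.strip().str.title()
    out["state"] = out["state"].str.strip()
    if year == 2014:
        mask = (
            out["pc_name"].str.lower().isin(TELANGANA_PCS)
            & (out["state"] == "Andhra Pradesh")
        )
        out.loc[mask, "state"] = "Telangana"
    return out


def duplicate_names(df: pd.DataFrame) -> list:
    """Return constituency names that appear under more than one state."""
    counts = df.groupby("pc_name")["state"].nunique()
    return sorted(counts[counts > 1].index)
```

Names returned by `duplicate_names` (for example Aurangabad, which exists in both Bihar and Maharashtra) should always be keyed together with their state rather than by name alone.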

Data Transformation

  1. Establish suitable dimension tables and primary keys to link the CSV files effectively.
  2. Construct aggregate tables using append and groupby functions as needed to format the data appropriately for answering the queries.
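
As a rough sketch of these two transformation steps, the snippet below builds a constituency dimension table with a surrogate key and a per-constituency aggregate. The column names (`state`, `pc_name`, `total_votes`, `total_electors`), the function names, and the definition of turnout ratio as total votes divided by total electors are all assumptions for illustration. `pd.concat` is used in place of `DataFrame.append`, which was removed in pandas 2.0.

```python
import pandas as pd


def build_constituency_dim(frames: list) -> pd.DataFrame:
    """One row per (state, pc_name), with a surrogate primary key `pc_id`."""
    combined = pd.concat(frames, ignore_index=True)
    dim = (
        combined[["state", "pc_name"]]
        .drop_duplicates()
        .sort_values(["state", "pc_name"])
        .reset_index(drop=True)
    )
    dim["pc_id"] = dim.index + 1
    return dim


def turnout_by_constituency(df: pd.DataFrame, dim: pd.DataFrame) -> pd.DataFrame:
    """Aggregate candidate-level rows to one row per constituency.

    Assumes `total_electors` is repeated on every candidate row of a
    constituency, so the first value is taken instead of a sum.
    """
    agg = df.groupby(["state", "pc_name"], as_index=False).agg(
        total_votes=("total_votes", "sum"),
        total_electors=("total_electors", "first"),
    )
    # Assumed definition: turnout ratio = votes cast / registered electors.
    agg["turnout_ratio"] = agg["total_votes"] / agg["total_electors"]
    return agg.merge(dim, on=["state", "pc_name"], how="left")
```

Keying the dimension table on the (state, constituency) pair rather than on the name alone keeps identically named constituencies in different states apart.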

Kaggle Notebook:

rpc11-loksabha-election.ipynb

RPC11_loksabha_election

Tasks:

Questions from the available data (Primary)

  1. List the top 5 and bottom 5 constituencies of 2014 and 2019 in terms of voter turnout ratio.
  2. List the top 5 and bottom 5 states of 2014 and 2019 in terms of voter turnout ratio.