Econometrics, coding, RStudio, presidential elections, USA (United States of America), data analysis
This coding exercise consists of one long analysis of data from the 2020 presidential election in the United States.
[...] Econometrics: coding on RStudio - US Presidential Elections

Loading the Data

1*. elections <- read.csv("/path/to/countypres_2000-2020.csv")

Understanding what the dataset contains

1. tail(elections, 10)
2. table_years <- table(elections$year)
   table_mode <- table(elections$mode)
   The data come from election years up to and including 2020.
   The "mode" variable contains the various voting methods or election modes.
   Each row represents the election results for a specific candidate in a specific county, for a given year and mode of voting.
   number_of_variables <- ncol(elections)    12 variables
   number_of_rows <- nrow(elections)         72,617 rows

Filtering the data and exploring further

1*. elections_2020 <- filter(elections, year == 2020)
2. [...]
[...]
reg2: R² is likely higher, indicating additional variance explained by including average income and manufacturing share.
reg3: Adds education level; R² shows the variance explained by unemployment, income, manufacturing, and education.
reg4: Includes demographic characteristics; likely has the highest R², showing the variance explained by all included economic, educational, and demographic variables.

A significant change in the coefficient for pct_unemp_rate from reg1 to reg2 indicates that mean_income_thousands and pct_manufac confound the relationship between the unemployment rate and Trump's vote share. [...]
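The confounding argument above can be illustrated with a small simulation. This is a minimal sketch on simulated data, not the exercise's county data: the variables `income`, `unemp`, and `vote`, and every coefficient used to generate them, are hypothetical and chosen only so that the omitted variable is correlated with the regressor of interest.

```r
# Simulated (hypothetical) data: income affects both unemployment and vote share.
set.seed(1)
n      <- 1000
income <- rnorm(n)
unemp  <- 5 - 0.5 * income + rnorm(n)             # unemployment falls with income
vote   <- 50 + 1 * unemp - 2 * income + rnorm(n)  # true effect of unemp is 1

# reg1 omits income; reg2 controls for it.
reg1 <- lm(vote ~ unemp)
reg2 <- lm(vote ~ unemp + income)

b1   <- coef(reg1)[["unemp"]]       # absorbs part of the income effect
b2   <- coef(reg2)[["unemp"]]       # close to the true value of 1
r2_1 <- summary(reg1)$r.squared
r2_2 <- summary(reg2)$r.squared     # cannot be lower than r2_1

print(round(c(reg1 = b1, reg2 = b2), 2))
print(round(c(reg1 = r2_1, reg2 = r2_2), 3))
```

In this setup the coefficient on `unemp` drops once `income` is added, because the short regression attributes part of the income effect to unemployment. R² also cannot fall when a regressor is added, which is why it is expected to rise from reg1 to reg4.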
[...] candidate_names <- table(elections_2020$candidate)

2*. elections_2020_clean <- elections_2020 %>%
      mutate(
        candidatevotes = ifelse(is.na(candidatevotes), 0, candidatevotes),  # 0 is an assumed replacement value
        totalvotes_county = ifelse(
          is.na(totalvotes_county),
          sum(candidatevotes, na.rm = TRUE),
          totalvotes_county
        )
      )

3*. elections_2020_clean <- elections_2020_clean %>%
      group_by(state, state_po, county_name, county_fips, totalvotes, candidate) %>%
      summarise(candidatevotes_sum = sum(candidatevotes, na.rm = TRUE)) %>%
      mutate(pct_votes = (candidatevotes_sum / totalvotes) * 100) %>%
      group_by(county_fips) %>%
      mutate(candidate_rank = rank(-candidatevotes_sum)) %>%
      ungroup()

4. invalid_pct_votes <- filter(elections_2020_clean, pct_votes > 100)  # comparison operator assumed to be >
   invalid_count <- nrow(invalid_pct_votes)

Maps

2. load("path/to/county_shfl.RData")
   fleming_fips_elections <- elections_2020_clean %>%
     filter(toupper(county_name) == "FLEMING") %>%
     select(county_fips) %>%
     unique()
   fleming_fips_shfl <- county_shfl %>%
     filter(toupper(NAME) == "FLEMING") %>%
     select(STATEFP, COUNTYFP, GEOID)
   FIPS code for "FLEMING" county is [...]
   typeof(elections_2020_clean$county_fips)
   typeof(county_shfl$GEOID)
   Reported type: float (R's typeof() labels floating-point values "double").
4. [...]
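The aggregation in step 3 (sum votes across voting modes, convert to a percentage of the county total, rank candidates within each county) can be checked on a toy table. The sketch below uses base R so that it runs without extra packages, whereas the exercise itself uses dplyr; the data frame `toy`, its county codes, candidates "A" and "B", and all vote counts are hypothetical.

```r
# Hypothetical toy data: two counties, votes reported by candidate and voting mode.
toy <- data.frame(
  county_fips    = c(1001, 1001, 1001, 1001, 1003, 1003),
  candidate      = c("A", "A", "B", "B", "A", "B"),
  mode           = c("ABSENTEE", "ELECTION DAY", "ABSENTEE", "ELECTION DAY",
                     "TOTAL", "TOTAL"),
  candidatevotes = c(100, 200, 150, 50, 400, 600),
  totalvotes     = c(500, 500, 500, 500, 1000, 1000)
)

# Sum votes over modes within each county and candidate.
agg <- aggregate(candidatevotes ~ county_fips + candidate + totalvotes,
                 data = toy, FUN = sum)
names(agg)[names(agg) == "candidatevotes"] <- "candidatevotes_sum"

# Vote share in percent, and rank within county (1 = most votes).
agg$pct_votes      <- agg$candidatevotes_sum / agg$totalvotes * 100
agg$candidate_rank <- ave(-agg$candidatevotes_sum, agg$county_fips, FUN = rank)

# Sanity check in the spirit of step 4: no share should exceed 100.
invalid_count <- sum(agg$pct_votes > 100)

agg <- agg[order(agg$county_fips, agg$candidate_rank), ]
print(agg)
```

Summing over modes before dividing matters: dividing each mode-level row by the county total would give partial shares that do not describe the candidate's overall result.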
[...] In each regression model, the coefficient for pct_unemp_rate reflects the expected change in Trump's vote share, in percentage points, for a one-percentage-point change in the unemployment rate, with varying levels of control for other factors:

reg1: Shows the change without controlling for other factors.
reg2: Adjusts for economic variables such as average income and manufacturing share.
reg3: Additionally controls for the level of higher education.
reg4: Additionally accounts for demographic characteristics such as race, age, gender, and county size.

In reg4, each coefficient shows how Trump's vote share (in percent) is expected to change with a one-unit change in that variable, holding the other variables constant: pct_unemp_rate, mean_income_thousands, pct_manufac, pct_bachelor, pct_afam, pct_age_above_62, pct_female, and county_size_thousands.

10. [...]
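Written out, the reg4 specification described above takes the following form. The notation is an assumption made for illustration: the index i for counties, the coefficient labels, and the error term do not appear in the source, and the dependent variable is taken to be pct_votes_trump.

```latex
\begin{aligned}
\text{pct\_votes\_trump}_i ={}& \beta_0
  + \beta_1\,\text{pct\_unemp\_rate}_i
  + \beta_2\,\text{mean\_income\_thousands}_i
  + \beta_3\,\text{pct\_manufac}_i \\
 &+ \beta_4\,\text{pct\_bachelor}_i
  + \beta_5\,\text{pct\_afam}_i
  + \beta_6\,\text{pct\_age\_above\_62}_i \\
 &+ \beta_7\,\text{pct\_female}_i
  + \beta_8\,\text{county\_size\_thousands}_i
  + \varepsilon_i
\end{aligned}
```

Under this notation, reg1 keeps only the first regressor, reg2 keeps the first three, and reg3 keeps the first four.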
[...] acs_2019_covariates <- acs_2019_covariates %>%
       mutate(county_size_thousands = (n_hhs * avgsize_hhs) / 1000,
              mean_income_thousands = mean_income / 1000)

3. acs_2019_covariates <- acs_2019_covariates %>%
     select(NAME, GEO_ID, pct_unemp_rate, mean_income_thousands, pct_manufac,
            pct_bachelor, pct_afam, pct_age_above_62, pct_female, county_size_thousands)

4. geo_id_fleming_county <- acs_2019_covariates %>%
     filter(NAME == "Fleming County, Kentucky") %>%
     pull(GEO_ID)
   The value is '0500000US21069'.

5*. acs_2019_covariates <- acs_2019_covariates %>%
      mutate(county_fips_clean = substr(GEO_ID [...]

6. elections_2020_covariates <- left_join(elections_2020_clean, acs_2019_covariates, by = "common_column_name")

Regression analysis

3. elections_2020_reg_trump <- elections_2020_covariates %>%
     filter(candidate == "Donald Trump") %>%
     select(state, state_po, county_name, county_fips_clean, pct_votes, pct_unemp_rate,
            mean_income_thousands, pct_manufac, pct_bachelor, pct_afam, pct_age_above_62,
            pct_female, county_size_thousands) %>%
     rename(pct_votes_trump = pct_votes)

2. load("pre_cleaned_dataset_for_regression_analysis/elections_2020_reg_trump.RData")
   skim(elections_2020_reg_trump, pct_votes_trump, pct_unemp_rate, mean_income_thousands,
        pct_manufac, pct_bachelor, pct_afam, pct_age_above_62, pct_female, county_size_thousands)

3. [...]
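Steps 4 to 6 link the two tables through the county FIPS code embedded in GEO_ID. The sketch below shows one way this could work, under stated assumptions: the FIPS code is taken to be the last five characters of GEO_ID (consistent with the value '0500000US21069' above), county_fips is taken to be numeric in the elections table, and base R's merge() stands in for left_join(). The tables `toy_elections` and `toy_acs`, the column `toy_value`, and the county code 99999 are hypothetical.

```r
# Hypothetical toy tables; only the GEO_ID value comes from the exercise.
toy_elections <- data.frame(
  county_fips = c(21069, 99999),   # numeric key, assumed; 99999 has no match
  candidate   = c("X", "X")
)
toy_acs <- data.frame(
  NAME      = "Fleming County, Kentucky",
  GEO_ID    = "0500000US21069",
  toy_value = 1,                   # placeholder covariate
  stringsAsFactors = FALSE
)

# Assumption: the county FIPS code is the last five characters of GEO_ID.
n_char <- nchar(toy_acs$GEO_ID)
toy_acs$county_fips_clean <- as.numeric(substr(toy_acs$GEO_ID, n_char - 4, n_char))

# Left join: keep every elections row, attach covariates where the key matches.
joined <- merge(toy_elections, toy_acs,
                by.x = "county_fips", by.y = "county_fips_clean",
                all.x = TRUE)
print(joined)
```

Converting both keys to the same type before joining is the point of checking typeof() on each column: a character key and a numeric key that look identical when printed will not match reliably.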