Support Vector Machines on Big Five Traits and Politics

This is an example of Support Vector Machines, using one of my usual data sets, as part of a Pluralsight training presentation, Data Mining Algorithms in SSAS, Excel, and R.

In terms of results, the model predicts voter leanings from two traits, openness and conscientiousness. Using all five factors improved the prediction quality, but a five-dimensional model is hard to plot. Going by the table below, the two-trait model correctly identifies 96% of Republican outcomes (27 of 28), but only 75% of Democratic ones (15 of 20).

                    Politics.predicted
 Politics.prediction Blue Red
                Blue   15   1
                 Red    5  27
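
The per-class rates can be recomputed from the table as printed. This is a minimal sketch in base R, assuming rows are predictions and columns are actual leanings, which is the order `table(Politics.prediction, Politics.predicted)` produces. The name `confusion` is mine, for illustration only.

```r
# Rebuild the confusion table shown above: rows = predicted, columns = actual
# (matrix() fills column by column, so 15 and 5 form the "Blue" column)
confusion <- matrix(c(15, 5, 1, 27), nrow = 2,
                    dimnames = list(predicted = c("Blue", "Red"),
                                    actual    = c("Blue", "Red")))

# Share of each actual class that was predicted correctly (per-class recall)
recall <- diag(confusion) / colSums(confusion)

# Share of all rows that were predicted correctly (overall accuracy)
accuracy <- sum(diag(confusion)) / sum(confusion)

print(round(recall, 2))
print(round(accuracy, 3))
```

Dividing by `rowSums(confusion)` instead would answer a different question: of the rows predicted Blue (or Red), how many really were.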

The code is below, as are some related graphs. Source data is here.
 # Clear memory  
 rm(list = ls())  
 # set Working directory  

 # load data
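 # stringsAsFactors = TRUE keeps Politics as a factor, which svm needs for classification (R 4.0+ defaults to FALSE)
 Politics.df <- read.csv("BigFiveScoresByState.csv", na.strings = c("", "NA"), stringsAsFactors = TRUE)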
 # clean data - remove NULLs  
 Politics.df <- na.omit(Politics.df)  
 # explore data #1 - str shows type, which is sometimes more useful than summary
 str(Politics.df)

 # explore data #2 - simple plots
 library(ggplot2)

 # plot openness
 plot.openness <- ggplot(Politics.df, aes(x = Openness, fill = Politics))  
 plot.openness.histo <- plot.openness + geom_histogram(binwidth = 1)  
 plot.openness.histo + scale_fill_manual(values = c("Red" = "red", "Blue" = "blue"))  

 # plot conscientiousness
 plot.conscientiousness <- ggplot(Politics.df, aes(x = Conscientiousness, fill = Politics))  
 plot.conscientiousness.histo <- plot.conscientiousness + geom_histogram(binwidth = 1)  
 plot.conscientiousness.histo + scale_fill_manual(values = c("Red" = "red", "Blue" = "blue"))  

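 # working data set is Politics.df
 # create subsets (column positions follow the layout of the source file)
 Politics.predictors <- Politics.df[, 4:6]
 Politics.predicted <- Politics.df[, 12]
 # review subsets - I used this to troubleshoot code issues
 # the svm function needs matrices, and I was incorrectly giving it lists
 str(Politics.predictors)
 str(Politics.predicted)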

 # load library for SVM
 library(e1071)

 # train
 Politics.svm <- svm(Politics ~ Openness + Conscientiousness, data = Politics.df)

 # review results
 summary(Politics.svm)

 # generate predict
 Politics.prediction <- predict(Politics.svm, Politics.predictors)  
 # combine predictions with actual  
 table(Politics.prediction, Politics.predicted)  
 # plot results
 plot(Politics.svm, Politics.df, Openness ~ Conscientiousness)
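
The confusion table in this post scores the model on the same rows it was trained on, which tends to flatter the result. Here is a minimal sketch of a hold-out check, assuming the `e1071` package is installed. The data frame `demo.df` is synthetic stand-in data generated for illustration, not the Big Five file, and the 70/30 split is an arbitrary choice.

```r
library(e1071)

set.seed(42)  # make the synthetic data and the split repeatable

# Synthetic stand-in data (NOT the real scores): two numeric traits and a label
n <- 100
demo.df <- data.frame(Openness = rnorm(n, mean = 50, sd = 10),
                      Conscientiousness = rnorm(n, mean = 50, sd = 10))
demo.df$Politics <- factor(ifelse(demo.df$Openness - demo.df$Conscientiousness +
                                    rnorm(n, sd = 5) > 0, "Blue", "Red"))

# Hold out 30% of the rows for testing
train.rows <- sample(seq_len(n), size = round(0.7 * n))
train.df <- demo.df[train.rows, ]
test.df  <- demo.df[-train.rows, ]

# Train on the training rows only
demo.svm <- svm(Politics ~ Openness + Conscientiousness, data = train.df)

# Score on rows the model has not seen
test.prediction <- predict(demo.svm, test.df)
holdout.table <- table(predicted = test.prediction, actual = test.df$Politics)
print(holdout.table)
```

With a data set as small as this one, a single split leaves very few test rows. The `cross` argument of `svm()` runs k-fold cross-validation and may be a better fit.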

