I do my own reading for data science and have my own side projects, but I’ve also taken some data science courses via Pluralsight. The beginner demos are well done, though the intermediate ones are ultimately more informative and useful. For the latter, I typically code along using my own data sets, which helps me learn the material.
This post is a demonstration using the caret and nnet packages on aggregate Big Five traits per state and political leanings. Given the small sample size, predictive accuracy was lower under an 80% train/test split (around .66) than when fitting and scoring on the entire set (.83). This is evident in the graphs using lm() and stat_smooth().
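The 80% split described above can be sketched as follows. This is a minimal, self-contained illustration with simulated stand-in data, not the post's actual state-level data set; the column names (openness, conscientiousness, leaning) are assumptions for the sketch.

```r
# Sketch of an 80/20 split with caret, on simulated stand-in data.
library(caret)

set.seed(42)
n <- 48
states <- data.frame(
  openness          = rnorm(n),
  conscientiousness = rnorm(n),
  leaning           = factor(sample(c("Blue", "Red"), n, replace = TRUE))
)

# createDataPartition keeps the Blue/Red class balance in both halves
idx   <- createDataPartition(states$leaning, p = 0.8, list = FALSE)
train <- states[idx, ]
test  <- states[-idx, ]

# Logistic regression on the training portion, scored on the held-out 20%
fit  <- glm(leaning ~ openness + conscientiousness,
            data = train, family = binomial)
prob <- predict(fit, test, type = "response")
pred <- ifelse(prob > 0.5, levels(states$leaning)[2], levels(states$leaning)[1])
accuracy <- mean(pred == test$leaning)
```

With only ~50 rows, a single 20% holdout is just a handful of states, which is why the out-of-sample number can swing well below the full-set figure.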
The code is below, as are some related graphs. Source data is here.
In terms of results, the regression predicts voter leanings from the five Big Five traits: openness, conscientiousness, extraversion, agreeableness, and neuroticism, although only the first two, openness and conscientiousness, have a significant impact. Logistic regression is about 85% accurate, and slightly better at predicting Red states than Blue states.
 "Logistic Regression (All) - Correct (%) = 0.854166666666667"
 "Logistic Regression (Red) - Correct (%) = 0.892857142857143"
 "Logistic Regression (Blue) - Correct (%) = 0.8"
For the neuralnet package, I created a loop to vary the number of hidden-layer nodes and the number of repetitions, since the data set has so few records. This is obviously much less predictive than logistic regression, …