
Patents Per Capita and Hofstede's Cultural Dimensions

Thinking about social dimensions and innovation, it occurred to me that there might be a relationship with masculinity, but I quickly dismissed the idea, considering innovation much more likely to be predicated on science and math education. Even then, other cultural elements might be more strongly correlated. What follows is an exploration of various correlations with patents per capita.

Hofstede's Cultural Dimensions did correlate significantly with patents per capita. Somewhat surprisingly, neither PISA scores by country, education, nor average IQ had a strong relationship with patent production, although statistically they would have had Asia been included.

Notes:
  • I often exclude Asia from analyses, as the initial driver of this work was looking at cultures that are similar, to tease out social effects.
  • That is also why I do not look at all countries: some relationships that hold across the entire world disappear when limited to just developed economies. As an example, the value of work and its benefits to social welfare are quite high when considered globally, but are almost non-existent, within my study parameters, for the developed world.
  • This looks only at patents per capita, to find effects unrelated to sheer size, where countries like the US and Japan dominate.
An explanation of dimensions can be found on Hofstede's site, and source data is here.
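
The per-capita normalisation and the exclusion of Asia described in the notes can be sketched in a few lines of R. This is only an illustration under assumptions: the toy data frame, its values, and the column names Region and Population are hypothetical and are not taken from the source data, which already ships with Asia removed.

```r
# Toy data: values and column names are made up for illustration only
toy <- data.frame(
  Country    = c("A", "B", "C", "D", "E"),
  Region     = c("Europe", "Europe", "Asia", "Americas", "Asia"),
  Patents    = c(1200, 300, 9000, 4500, 2500),
  Population = c(10e6, 5e6, 120e6, 300e6, 50e6)
)

# Normalise by population so that large countries do not dominate
toy$PatentsPerCapita <- toy$Patents / toy$Population

# Keep only the non-Asian rows, mirroring the "Minus Asia" file
toyMinusAsia <- subset(toy, Region != "Asia")

print(toyMinusAsia[, c("Country", "PatentsPerCapita")])
```

Running the same correlation on toy and on toyMinusAsia is one way to see how much a result depends on which countries are included.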

Loading Data Vectors

In retrospect I could have done this in a matrix, and maybe should have, but I am just getting back into coding, and did not start this as a clear project, but as a simple exploration.


 # LM - Multiple Regression - Patents  
 # Clear workspace  
 rm(list = ls())  

 # Set working directory  
 setwd("../Data")  
 oecdData <- read.table("OECD - Quality of Life - Minus Asia.csv", header = TRUE, sep = ",")  
 print(names(oecdData))  

 # Set Vectors  
 # Primary focus  
 vPatents <- oecdData$Patents  
 vPatentsPerCapita <- oecdData$PatentsPerCapita  

 # Hofstede  
 vPowerDx <- oecdData$HofstederPowerDx  
 vMasculinity <- oecdData$HofstederMasculinity  
 vIndividuality <- oecdData$HofstederIndividuality  
 vUncertaintyAvoidance <- oecdData$HofstederUncertaintyAvoidance  
 vLTO <- oecdData$HofstederLongtermOrientation  
 vIndulgence <- oecdData$HofstederIndulgence  

 # Education  
 vPisaScience <- oecdData$PISAScience  
 vPisaMath <- oecdData$PISAMath  
 vPisaReading <- oecdData$PISAReading  
 vIQ <- oecdData$IQ  
 vTertiaryEdu <- oecdData$TertiaryEdu  
 vEduReading <- oecdData$EduReading  
 vEduScience <- oecdData$EduScience  

 # Social welfare  
 vGini <- oecdData$Gini  
 vLifeExpectancy <- oecdData$LifeExpectancy  
 vObesity <- oecdData$Obesity  
 vInfantDeath <- oecdData$InfantDeath  
 vHoursWorked <- oecdData$HoursWorked  
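
As a rough sketch of the matrix-style alternative mentioned above, the columns can stay in the data frame and be tested against patents per capita in one pass. The data here is synthetic and stands in for oecdData; only the column names PatentsPerCapita, PISAScience, PISAMath, and IQ are borrowed from the code above, and the numbers carry no meaning.

```r
# Toy stand-in for oecdData; the numbers are synthetic, not real measurements
set.seed(1)
toy <- data.frame(
  PatentsPerCapita = runif(20),
  PISAScience      = rnorm(20, mean = 500, sd = 20),
  PISAMath         = rnorm(20, mean = 500, sd = 20),
  IQ               = rnorm(20, mean = 100, sd = 5)
)

# Every column except the outcome is treated as a predictor
predictors <- setdiff(names(toy), "PatentsPerCapita")

# Run cor.test for each predictor against patents per capita
results <- t(sapply(predictors, function(col) {
  ct <- cor.test(toy$PatentsPerCapita, toy[[col]])
  c(r = unname(ct$estimate), p = ct$p.value)
}))

# One row per predictor, with the correlation and its p-value
print(round(results, 3))
```

With the real file, replacing toy with oecdData and listing the columns of interest in predictors would give the same kind of table without creating a vector per column.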


Cultural Dimensions and Patents per Capita

The two dimensions with the highest correlations, both with p-values under .01, were Uncertainty Avoidance and Individuality. Translated into common speech, cultures that tolerate ambiguity and are the least rule-based, and that also have high individuality, produce a large number of patents per capita.


 # Patents per Capita ~ Hofstede  
 Hofstede_PatentsPerCapita <- lm(vPatentsPerCapita ~ vPowerDx + vIndividuality + vMasculinity + vUncertaintyAvoidance + vLTO + vIndulgence)  
 print(Hofstede_PatentsPerCapita)  
 print(summary(Hofstede_PatentsPerCapita))  
 print(anova(Hofstede_PatentsPerCapita))  

 cor.test(vPatentsPerCapita, vIndividuality)
 plot(vPatentsPerCapita, vIndividuality, col = "blue", main = "vPatentsPerCapita ~ vIndividuality", cex = 1.3, pch = 16, xlab = "Patents per Capita", ylab = "Individuality")
 # Draw the fitted line once the plot exists, instead of passing abline() as a plot() argument
 abline(lm(vIndividuality ~ vPatentsPerCapita))

 cor.test(vPatentsPerCapita, vUncertaintyAvoidance)
 plot(vPatentsPerCapita, vUncertaintyAvoidance, col = "blue", main = "vPatentsPerCapita ~ vUncertaintyAvoidance", cex = 1.3, pch = 16, xlab = "Patents per Capita", ylab = "Uncertainty Avoidance")
 # Same approach for the second scatter plot
 abline(lm(vUncertaintyAvoidance ~ vPatentsPerCapita))
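
Rather than reading the printed summary by eye, the terms that clear the .01 threshold can be pulled from the coefficient table. The sketch below uses synthetic data, and the names x1, x2, and y are placeholders rather than anything from the OECD file. Note that p-values from lm are conditional on the other predictors in the model, whereas cor.test looks at each pair in isolation, so the two need not agree.

```r
# Synthetic data for illustration only: x1 drives y, x2 is pure noise
set.seed(42)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 * x1 + rnorm(n)

fit <- lm(y ~ x1 + x2)

# Coefficient table with columns Estimate, Std. Error, t value, Pr(>|t|)
coefTable <- coef(summary(fit))

# Drop the intercept row, then keep the terms with p < .01
slopes <- coefTable[-1, , drop = FALSE]
significant <- slopes[slopes[, "Pr(>|t|)"] < 0.01, , drop = FALSE]

print(significant)
```

Applied to Hofstede_PatentsPerCapita in place of fit, the same three lines would list whichever dimensions fall under the threshold in the multiple regression.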



