haotu : an open lab notebook


dplyr column names with spaces, how to index

Filed under: Manipulate Data in R, R, R Stats — S @ 09:12

Backtick it! Wrap the column name in backticks wherever you refer to it:

`my name`
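A minimal sketch of how this looks in a few dplyr verbs. The data frame `df` and its columns are hypothetical, and the example assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical data frame; check.names = FALSE keeps the space in the column name
df <- data.frame(`my name` = c("a", "b"), score = c(1, 2), check.names = FALSE)

picked  <- df %>% select(`my name`)            # select the column
matched <- df %>% filter(`my name` == "a")     # use it in a condition
renamed <- df %>% rename(my_name = `my name`)  # rename once to avoid backticks later
```

The same quoting works in base R, e.g. df$`my name`, or you can use a string with df[["my name"]].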




All Pairwise Combinations of Rownames from a Square Matrix

Filed under: R, R spatial, R Stats, Uncategorized — S @ 12:37

combn(rownames(my.square.matrix), m = 2, FUN = paste, collapse = "-", simplify = TRUE)
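A small worked example of the call above, using a hypothetical 3 x 3 matrix with row and column names a, b, c. The last step, which pulls the matching matrix entry for each pair, is an addition for illustration.

```r
# Hypothetical 3 x 3 matrix with named rows and columns
my.square.matrix <- matrix(1:9, nrow = 3,
                           dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# One label per unordered pair of row names
pair_labels <- combn(rownames(my.square.matrix), m = 2, FUN = paste, collapse = "-")
pair_labels
# [1] "a-b" "a-c" "b-c"

# Look up the matrix entry for each pair by indexing with a two-column name matrix
pair_index  <- t(combn(rownames(my.square.matrix), m = 2))
pair_values <- setNames(my.square.matrix[pair_index], pair_labels)
pair_values
# a-b a-c b-c
#   4   7   8
```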


Make ggplot look like base plot in R

Filed under: ggplot, R, R graphics — S @ 04:46

# No panel border: only the two axis lines are drawn, like a base plot with bty = "l"
myplot + theme_bw() + theme(panel.border = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"))
# Keep the panel border: a full box around the plot, like the base default
myplot + theme_bw() + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"))
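For a self-contained version to try, here is a sketch that builds a hypothetical `myplot` from the built-in mtcars data and applies the borderless variant. It assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Hypothetical plot; any ggplot object can stand in for myplot
myplot <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# White background, no grid lines, no box, black axis lines
base_like <- myplot + theme_bw() +
  theme(panel.border     = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line        = element_line(colour = "black"))
base_like

# ggplot2's built-in theme_classic() gives a similar look in one call
classic <- myplot + theme_classic()
```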






The relationship between VIF and R2 (r squared)

Filed under: Math and Stats, R, R Stats — S @ 09:08

Variance Inflation Factor (VIF) is a common, simple statistic used to quantify multicollinearity in least squares regressions. It is calculated for each covariate in a regression, with higher values meaning that the covariate is more collinear with the other covariates. Technically, it measures “how much the variance (the square of the estimate’s standard deviation) of an estimated regression coefficient is increased because of collinearity.” The equation is:

\mathrm{VIF}_i = \frac{1}{1 - R_i^2}

where R_i^2 is the R2 from the regression of covariate i on all the other covariates. The problem is where to draw the cutoff. Is a VIF > 2.5 too high? > 5? How about VIF > 10? All have been used as cutoffs. Rearranging the equation gives R_i^2 = 1 - 1/VIF_i, so a cutoff of 2.5 corresponds to an R2 of 0.60, a cutoff of 5 to 0.80, and a cutoff of 10 to 0.90! The code below plots R2 against VIF. While statistically you could perhaps get away with these high inflations, what does it mean for your particular question? If you are dealing with a relationship among covariates that is as strong as 0.90, can you really be sure that the model and your interpretations are valid?




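# Convert a VIF to the R2 of the focal covariate regressed on all other covariates
r <- function(x) { 1 - (1 / x) }  # r is R2 and x is VIF
x <- seq(1, 15, by = 0.1)         # sequence of VIFs
y <- r(x)                         # matching R2 values (r is vectorized, so sapply is not needed)
plot(x, y, type = "l", xlab = "VIF", ylab = "R2 of regression of focal covariate on all other covariates")
# common VIF cutoffs = 2.5, 5, 10
abline(v = c(2.5, 5, 10), lty = 2)  # mark the cutoffs with dashed vertical lines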
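To see where a VIF comes from, here is a minimal sketch that computes one by hand from the definition. The choice of mtcars and of disp, hp and wt as covariates is only for illustration.

```r
# Focal covariate: disp; other covariates: hp and wt (built-in mtcars data)
aux      <- lm(disp ~ hp + wt, data = mtcars)  # regress the focal covariate on the others
r2_disp  <- summary(aux)$r.squared             # this is R_i^2 in the equation
vif_disp <- 1 / (1 - r2_disp)                  # VIF for disp
vif_disp

# R2 implied by the common cutoffs
cutoffs   <- c(2.5, 5, 10)
r2_at_cut <- 1 - 1 / cutoffs
r2_at_cut
# [1] 0.6 0.8 0.9
```

Packages such as car provide a vif() function that does this for every covariate of a fitted model in one call.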


make and check directories in R

Filed under: Manipulate Data in R, R, R Stats — S @ 07:32
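One way to do this in base R, sketched with dir.exists() and dir.create(). The directory name is hypothetical and sits under the session's temporary folder, so nothing permanent is created.

```r
# Hypothetical output directory inside the session's temporary folder
out_dir <- file.path(tempdir(), "results", "figures")

# Check whether the directory exists, and create it (with any parents) if not
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

dir.exists(out_dir)
# [1] TRUE
```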


R packages to help

Filed under: R, R Stats — S @ 16:11

I just found two R packages that are helpful when searching for functions:


Package sos searches through all R help files etc. and finds relevant functions and packages.


Package ctv links to the CRAN Task Views, and you can quickly install all the packages of a particular task view.
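A short sketch of what using the two packages might look like. The search term and task view name are only examples, and the calls assume both packages are installed and a network connection is available, so they are guarded to run in an interactive session only.

```r
search_term <- "variance inflation factor"  # example query
task_view   <- "Spatial"                    # example CRAN task view

if (interactive() &&
    requireNamespace("sos", quietly = TRUE) &&
    requireNamespace("ctv", quietly = TRUE)) {
  hits <- sos::findFn(search_term)  # search help pages across CRAN packages
  print(hits)                       # shows the matches as a table in the browser
  ctv::install.views(task_view)     # installs every package in the view (can take a while)
}
```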


Get the R code for the functions of a package

Filed under: Uncategorized — S @ 03:58

Download the source tarball (.tar.gz) from CRAN and extract the gz to get the tar. Then extract the tar.


I use 7zip since it is free!


Look in the R folder; the code you are searching for will be there.
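The same steps can also be done from within R instead of with 7zip. This is a sketch only: list_pkg_r_files() is a hypothetical helper, and the package name in the usage comment is just an example.

```r
# Unpack a source tarball and return the files found in its R folder.
# `tarball` is the path to a .tar.gz from CRAN; `exdir` is where to unpack it.
list_pkg_r_files <- function(tarball, exdir = tempfile("pkgsrc")) {
  dir.create(exdir, recursive = TRUE, showWarnings = FALSE)
  utils::untar(tarball, exdir = exdir)  # handles the gz and tar layers in one step
  files <- list.files(exdir, recursive = TRUE, full.names = TRUE)
  files[basename(dirname(files)) == "R"]  # keep only files directly inside an R folder
}

# Usage (needs a network connection):
# dl <- download.packages("sos", destdir = tempdir(), type = "source")
# list_pkg_r_files(dl[1, 2])  # second column holds the path of the downloaded file
```

For a quick look at a single function from an installed package, typing its name without parentheses at the console prints its code as well.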
