Friday 13 January 2017

Getting to the Median (Mean, Range, SD) Intercorrelation of a set of Variables Quickly in R



Sometimes you are interested in the inter-relatedness of many variables. That's pretty much nine tenths of psychometrics.

You may want to use Meng's test of the heterogeneity of correlated correlation coefficients (see this previous blog and this one too if you're interested in that test). Perhaps you have many individuals measured on numerous predictor variables and you want to check whether their relations with a shared outcome variable differ. To do this test you'll need to find the median inter-correlation of the predictor variables. More specifically, the median Pearson's r.

This is a bit of a pain in the backside because most statistical software packages don't have a checkbox for this. Finding the median inter-correlation of a set of variables is not something SPSS, for example, will readily do for you. It's an unusual request.

Below is some simple code for finding the median inter-correlation of a set of variables. I've nicked most of it from this page on the STHDA website, so the credit goes to them. That's a good website, so check it out when you've got a minute.

But, before that, two things.

1) Create a .csv file containing ONLY the raw data for the variables whose heterogeneity you want to test. So no age, gender, IP address, eye colour, ethnicity, or the shared outcome variable of interest. None of that. Just the columns for the specific variables, so the .csv should have exactly as many columns as variables. If you don't know how to do that, don't worry, it's not difficult, and you can find out here.

2) Install the package Hmisc, which contains many varied and useful functions for data analysis, not just the rcorr function used here.
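If your data currently sit in a bigger data frame, step 1 can also be done in R itself. This is only a rough sketch: `full_data`, the column names and the file name `predictors.csv` are all made up for illustration, so swap in your own.

```r
# Hypothetical full data set: two columns we do NOT want (age, outcome)
# and three predictor columns we do want.
full_data <- data.frame(
  age     = c(23, 35, 41, 29, 52, 38),
  outcome = c(10, 14, 9, 12, 15, 11),
  pred1   = c(3, 5, 2, 4, 6, 4),
  pred2   = c(7, 9, 6, 6, 10, 8),
  pred3   = c(1, 4, 2, 3, 3, 5)
)

# Keep only the predictor columns (hypothetical names).
predictor_names <- c("pred1", "pred2", "pred3")
predictors <- full_data[, predictor_names]

# Write them out with no row-name column, so the .csv has
# exactly one column per variable.
write.csv(predictors, "predictors.csv", row.names = FALSE)
```

Setting `row.names = FALSE` matters here: otherwise R writes an extra first column of row numbers, which would be read back in as a variable.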

Then do this
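Here is a minimal sketch of the steps, assuming your .csv contains only numeric predictor columns and at least five rows (rcorr needs more than four observations). The helper name `intercorrelation_summary` and the file name `predictors.csv` are placeholders of mine, not part of Hmisc, and this is not necessarily identical to the STHDA code. The idea is the same though: run rcorr, keep each pair of variables once, then summarise.

```r
# install.packages("Hmisc")   # run once if the package is not yet installed
library(Hmisc)

# Summarise the pairwise Pearson correlations among the columns of `data`.
# `intercorrelation_summary` is a hypothetical helper name, not an Hmisc function.
intercorrelation_summary <- function(data) {
  # rcorr wants a matrix; it returns a list with r (correlations),
  # n (observations per pair) and P (p-values).
  res <- rcorr(as.matrix(data), type = "pearson")

  # Keep each pair once: the upper triangle leaves out the diagonal of 1s
  # and the mirrored lower half of the matrix.
  r <- res$r[upper.tri(res$r)]

  c(median = median(r), mean = mean(r),
    min = min(r), max = max(r), sd = sd(r))
}

# Usage (assumes a file called predictors.csv in your working directory):
# predictors <- read.csv("predictors.csv")
# intercorrelation_summary(predictors)
```

With only two variables there is a single correlation, so the SD comes back as NA. That's expected rather than a bug.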



The great thing about doing it this way is that next time you have to do it (on another data set collected at a different time) you can just rerun the script.

