Monday 30 January 2017

Zou's (2007) MA Method for Confidence Intervals for the Difference Between two Overlapping Correlation Coefficients Made Easier in R

The Pearson's r quantifies the linear association between two variables. It's hard to imagine personality and social psychology without it. It's quite a simple thing, and it emerged in earnest way back in the 1880s. It looks like this* (for a sample):

r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}

And you can use this simple equation to get a t value:

t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}

Sometimes researchers are interested in how Pearson's r correlation coefficients compare. Perhaps they are interested in whether two or more coefficients are different, because their theory says they should be. They probably won't want to do this by simply looking at the two coefficients and seeing whether they are nonidentical. No, that won't fly at the journals, so they'll want to do this by using some kind of statistical procedure.

Until recently, I hadn't realized that there were quite so many ways of comparing correlated correlation coefficients (I did a blog on testing heterogeneity if you're interested). According to Meng, Rosenthal, and Rubin (1992), Hotelling's t test was the most popular method of comparing two correlated (i.e. overlapping) correlation coefficients until the early 90s, despite the problems with it noted by Steiger (1980). The psych package contains two tests from that era, Williams's test and one of Steiger's (though I'm not sure precisely which). The latter you can also run on this cool webpage made by Ihno Lee and Kristopher Preacher.

But since that time at least two alternative methods have been put forward for use. First, that of Meng, Rosenthal, and Rubin (1992): a Z-test type method "equivalent to Dunn & Clark's test asymptotically but is in a rather simple and thus easy-to-use form" (p. 172).

Second, and the subject of this blog, that of Zou (2007): a technique described by its author as a "modified asymptotic" (MA) method. This one's not so much a test as a method for creating confidence intervals around the difference between two correlated correlation coefficients. As encouraged by Zou, you can use it in a test-like way by checking whether or not the interval includes 0.

I'm not sure if it's implemented anywhere else, but having an R script helps you understand the mechanics of it anyway, so here's how you can do it quickly in R. Download this R script from my dropbox and follow the instructions. The code is set up for the example given by Zou, but just substitute in your own coefficients and 95% confidence intervals for the focal coefficients.

You're best off just following that script, but in short, from the kick-off you need to know three things: the two correlated coefficients you're interested in, r12 and r13 (same X, different Ys), and the correlation between the two Ys, given as r23.

r12 <- .396
r13 <- .179
r23 <- .088

From these we need to find the Fisher z 95% confidence interval lower and upper bounds for the two correlation coefficients we're interested in. That's four more things. We call them: l1, u1 (lower and upper bounds for the first coefficient), l2, and u2 (lower and upper bounds for the second coefficient). This is really easy in R (e.g. r.con(r12, 66, p = .95), from the psych package). Plug them in in this bit:

l1 <- .170
u1 <- .582
l2 <- -.066
u2 <- .404
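
If you'd rather have R work those bounds out than type them in by hand, something like this does the trick (r.con() is from the psych package, and 66 is the sample size in Zou's example):

library(psych)

n <- 66                # sample size in Zou's example
ci12 <- r.con(r12, n)  # 95% CI for r12 (lower, upper)
ci13 <- r.con(r13, n)  # 95% CI for r13 (lower, upper)
l1 <- ci12[1]; u1 <- ci12[2]
l2 <- ci13[1]; u2 <- ci13[2]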

Then we need to square r12 and r13. And then square and cube r23.

Then we need to find the correlation between the two correlations, which you get with this (and which involves those r squares and the r cubed too):

c = \frac{(r_{23} - \tfrac{1}{2} r_{12} r_{13})(1 - r_{12}^2 - r_{13}^2 - r_{23}^2) + r_{23}^3}{(1 - r_{12}^2)(1 - r_{13}^2)}

Then we chuck all of that in this, for the lower bound:

L = r_{12} - r_{13} - \sqrt{(r_{12} - l_1)^2 + (u_2 - r_{13})^2 - 2c(r_{12} - l_1)(u_2 - r_{13})}

And the same for the upper bound (the same equation, but a little different):

U = r_{12} - r_{13} + \sqrt{(u_1 - r_{12})^2 + (r_{13} - l_2)^2 - 2c(u_1 - r_{12})(r_{13} - l_2)}

If you do each step sequentially in the R script you'll end up with lower and upper confidence limits for the difference between two correlated correlation coefficients, as presented in Zou (2007). Run through it with the Zou example first, and then plug yours in.
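
For reference, here's a minimal sketch of those steps in one place. It isn't the dropbox script itself, just the equations above typed out in R with Zou's example numbers; it should give limits of roughly -.09 and .52.

r12 <- .396; r13 <- .179; r23 <- .088
l1 <- .170; u1 <- .582
l2 <- -.066; u2 <- .404

# correlation between the two correlation estimates
c.r12.r13 <- ((r23 - 0.5 * r12 * r13) * (1 - r12^2 - r13^2 - r23^2) + r23^3) /
  ((1 - r12^2) * (1 - r13^2))

# lower and upper limits for the difference r12 - r13
lower <- r12 - r13 - sqrt((r12 - l1)^2 + (u2 - r13)^2 - 2 * c.r12.r13 * (r12 - l1) * (u2 - r13))
upper <- r12 - r13 + sqrt((u1 - r12)^2 + (r13 - l2)^2 - 2 * c.r12.r13 * (u1 - r12) * (r13 - l2))

lower; upper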

If you missed the link above, you can download the script again here.

Zou, G. Y. (2007). Toward using confidence intervals to compare correlations. Psychological Methods, 12(4), 399-413.

*(equations courtesy of the wikipedia page)


Friday 27 January 2017

Tips for Early-Stage Use of R



As my PhD draws to a close, I've made a commitment to learn how to do all the stats I do in R. This is because (a) the packages that make up R are just extremely good pieces of kit and (b) I am worried about the possibility of not being attached to a research institution and therefore having to pay a small fortune for statistical packages. SPSS, for example, is not only a questionable piece of software, but also starts at £949 per year per user. No, there shouldn't be a decimal place in there - you really do have to pay nearly a thousand pounds. R, on the other hand, is brilliant and free. At least for now anyway.

I'll admit that I've had a few false starts along the way, for a childish reason that says more about me than it does about R: I've got really, really angry when it did something stupid and/or wouldn't give me the thing I wanted it to give me or wouldn't let me do the thing I wanted to do. I think I might have been avoidantly attached in childhood or something, which means I have shunned R at the first sign of rejection on more than one occasion.

I think at least two of those false starts could have been avoided had someone provided me with some really basic information. The kind of information that they don't really tell you in the books they write about R. Probably because the authors of those books have a level of expertise that hides the annoying little hurdles that the beginner has to hurdle. So they don't have answers to some of those basic questions that you may have.

Here's a few basic answers to a few basic questions.

Q: All I hear about on Twitter is this thing called R. What the hell is R? And why the hell should I use it? I've gotten this far without it, haven't I? 

R is a big scientific calculator that you can download and use on your computer. In itself that's not that impressive, granted, but it has all of these packages that you can download and use within it that are really useful. Think of R as a games console (PS4, Xbox) and the packages as the games you have to put in it. But the thing is you don't have to buy them, each game is about statistics or data analysis, and they're not an unbelievable waste of time; they're tools and they help you get your work done. Most of the packages allow you to do the same things you did in SPSS or SAS or AMOS or whatever, but most in some sense go above and beyond the equivalent statistical procedures. They are also frequently updated. Think of each package, created by a passionate and probably crazed expert on that topic, as an SPSS dedicated to a single statistical procedure or a family of them. What this means is that each package helps you do something cool with your data - or not your data per se, numbers in general - in a way that you've probably never been able to before.

Want to generate data? Do it in R (e.g. the "stats" or "MASS" package). Want to run a structural equation model? Do it in R (e.g. the "lavaan" package). Want to do an exploratory factor analysis? Do it in R (e.g. "psych"). Want some beautiful plots of your data? Do it in R (e.g. "ggplot2"). Want to do all of these things sequentially, save the code, and then be able to do it again, or have someone else do it, in a couple of months? Do it in R (write a script). Want to do something no one has done before? Make your own code or package. There's a creativity in R that you don't get with the conventional data software, because packages in combination open up infinite possibilities. And think how much time this is all saving you in the long run, not needing to go from SPSS to AMOS to LISREL to Excel and back again. And back again the next time you do it.

Add to that the fact (a term here defined as my opinion) that R, particularly RStudio, is extremely user-friendly and aesthetically about as good as any computer data-analysis aid you will find, and it's clear that choosing to get to know R has the potential to be one of the better decisions you'll make this year.

Q. I want to quickly get a feel for it. I have some data. How do I get it in to R?

Easy. Download R and download RStudio. Hopefully R is already on your computer; it's likely that it is if you're on a computer at a British university. If not, go here https://cran.r-project.org/mirrors.html and choose a mirror near you. Then download the version that works with your operating system.

Now open RStudio. First, set your working directory; this is the folder on your computer that R will look in when you tell it to load a file. Go to Session -> Set Working Directory -> Choose Directory... and choose the folder you want. Perhaps a folder on a memory stick.
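
If you prefer typing to clicking through menus, the same thing can be done in the console (the path below is a made-up example - point it at your own folder):

setwd("E:/my-analysis-folder")  # made-up path: replace with your own
getwd()                         # prints the folder R is currently looking in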



Now place a .csv file in the very same folder you have just set as your working directory. If you don't often work with .csv files, consult this page here, which tells you what one is and how to save an existing Excel file as one. In short, .csv stands for comma-separated values. It's a kind of file used widely for the management and manipulation of databases.

Next, get the data from the .csv file into R. Type this in the console: tbl <- read.csv("insertnameoffilehere.csv"). Insert the name of the file exactly as it appears in the working directory (the folder you have told R to look in). The tbl bit is arbitrary; call it whatever you want. That's just a label for your .csv that you can now use in your code. Press enter.

Now simply type tbl and press enter. It will show you your data. Well, not all of it, but a bit of it. Look at the headers given to each column. Let's pretend that you have a column headed "gpa". To get some simple statistics for this variable type the following.

max(tbl$gpa)
min(tbl$gpa)
mean(tbl$gpa)
median(tbl$gpa)
range(tbl$gpa)

Note that the dollar is there for a reason, not because I am materialistic. So don't forget it!  And if you're British don't get all nationalistic by putting a £ in there instead. Pound sterling just won't do in this context.

You won't need to do this all the time - add that $ - but if you import a .csv and want to get straight to it, absolutely no messing about, the dollar is crucial. Many of the books on R don't make it clear that this is the case, mainly because they show you how to perform operations on data that comes preloaded in a package. The analysis of that kind of data does not require this piece of code.

This is all so simple. I do realize that. And you can do this so easily in the software you were using before, I know. But doing this successfully will get you off the ground, giving you a bit of confidence to get really stuck in.

But let's crank it up. Let's get multivariate. Thanks to classical test theory, if you're a psychologist you probably have a multi-item scale in there, yes? If so, here's how to test a single-factor model.

Install lavaan. The guide to it can be found here http://lavaan.ugent.be/tutorial/index.html

install.packages("lavaan")  # install once
library(lavaan)             # then load it each session

Find the names of the items of the scale. Let's pretend they are V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, change the code below as appropriate (change the Vs to whatever).

#create model
singleF.model <- 'scale =~ V1 + V2 + V3 + V4 + V5 + V6 + V7 + V8 + V9 + V10'
#fit model
fit <- cfa(singleF.model, data = tbl)
#summary of model
summary(fit, fit.measures=TRUE)

Q. What are the different areas of the display in RStudio and what can you do in each?

As standard, RStudio appears with four panes (but you can customize it to your own specifications eventually - find out how here).



The top left pane is the source. This is where imported scripts will appear, as well as where you can create your own. The bottom left is the console. This is where you can execute code (well, you can do that from the source too, but you can't save the code you type in the console).

As standard, the top right pane contains two things. First, "environment", which displays the data sets you have imported, as well as the objects and functions you have created in your current session. Second, "history", which displays a timeline of all the code you have executed during your session.

As standard, the bottom right pane contains:

  • "files". This is where files are displayed in case you want them. As standard, RStudio shows you the files contained in your "my documents" folder.
  • "plots". This is where plots will appear when you make them (e.g. boxplot()).
  • "packages". This is where installed packages are displayed, as well as where they can be loaded via the tick-box interface.
  • "help". This is where you can search all of the commands contained in your packages. A help request (e.g. help("read.csv")) enacted in the console will appear here.

Q. What books should I get? Are there websites or Twitter accounts that are useful?

Get the R Cookbook, ggplot2: Elegant Graphics for Data Analysis, and Statistics: An Introduction Using R for a kick-off. Don't rush through them, progress slowly for efficient learning.

There are plenty of really useful websites, including www.r-bloggers.com and their associated twitter feed @Rbloggers, which you can go to directly by clicking here. Click here for a long list of useful sites.


Q. I can't get my analyses to run today, even though I did them yesterday. What's going on?

Check to see if the packages are loaded. Check also that the packages are installed. If you have a buggy computer like mine, packages will sometimes just disappear. Reinstall the package if it repeatedly fails to appear on the packages list.
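
A quick way to check from the console, with "lavaan" standing in here for whichever package is misbehaving:

"lavaan" %in% rownames(installed.packages())  # TRUE if the package is installed
"package:lavaan" %in% search()                # TRUE if it is currently loaded
library(lavaan)                               # loads it (and complains if it really has vanished)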

Check if the package appears and is ticked on the list:



Q. I've installed a package, I think. It told me it successfully installed, but it won't let me use it. What do I do?

Restart R. If that doesn't work, make sure all packages that it depends on for functioning are installed too. It will probably tell you if they aren't.

Consult your package library to make sure it is there. Your package library is where all the packages are stored (it can be on your hard drive or somewhere else). When you load a package by clicking the tick box in the packages tab, something like this will come up in the console: library("stats", lib.loc="C:/Program Files/R/R-3.3.2/library"). The lib.loc is where the package is stored. Check that place to make sure it's there.
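
You can also ask R directly where it is looking and whether it can see the package (again with "lavaan" as the stand-in example):

.libPaths()             # the folders R searches for installed packages
find.package("lavaan")  # returns the package's folder, or an error if R can't find it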

Q. How do I remember all the code that I've used?

Write scripts instead of coding in the console. Save the scripts for a particular analysis in the same place as the data files they use, so you can easily find them again.
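
And if you have been coding in the console anyway, you can still rescue what you typed: base R's savehistory() writes the commands from your current session to a file (the filename here is just an example):

savehistory("todays-session.Rhistory")  # save the session's commands; reload later with loadhistory()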

Follow these great rules when you code.




So there you have it, a few tips that may make your transition to R a little easier. My old German teacher used to say that learning a new language was not difficult, it was just different. I think the same is true of learning how to use a new statistical interface. And much like learning a language, immersion is the best way to learn. So immerse yourself in R and don't give up when the going gets a little confusing.

Wednesday 25 January 2017

A Trip to Texas for SPSP 2017


I am back. I have jet lag and I need to detox. But I am back.

Back from a trip to the US, all in the name of personality and social psychology. I was there to present a poster on the definition of self-esteem at SPSP2017 (available to download from the open science framework here). The poster is about a meta-analytic investigation of what self-esteem is to personality and social psychologists. The project has taken the best part of two years so far. The short story is that self-esteem is defined in a number of ways, but primarily as one's overall evaluation of their own worth or value. Don't ask me how I got into this topic--it's a long story--but it's helped me to understand things a little better and I hope that it helps others too. Some of the feedback I've had on it so far makes me think that it will.

I went to some great talks and had some great conversations with people about their research at the poster sessions. I particularly enjoyed the symposiums or symposia (I don't know) on replication research in grad school, conceptual and empirical issues in narcissism, the one about the future of social and personality psychology, and the alt-ac one about jobs outside the academy. I learned a lot and I got to meet a few of my academic heroes. It was certainly a productive three or four days.

It was also my first time in the US, so I tried to make sure the trip was not just an academic one. I didn’t want my adventure to be all ps, ts, replications, and coffee, so I flew out early and stayed late, in the hope that this would allow me to explore Texas a little bit. I spent the first three days in Houston and the next five in San Antonio.

I was a little bit taken aback by the place, if I'm honest. In a really good way. Houston - at least Downtown Houston, where I stayed - is a big, bad, ultra-modern urban-jungle type of place, with more blocks of skyscrapers, high-rise car parks, and sports arenas than you can shake a stick at. The parks around the city are either quirky or grand. Sam Houston Park, to the east, contains old-style Texan buildings, immaculately preserved or restored by the local historic association. Being on the edge of Downtown, they have this surreal metropolitan backdrop to the west, which creates a contrast of architecture in the extreme. Hermann Park, to the south, is much larger, and contains a couple of duck-filled lakes, as well as one of those ultra-American reflection pools. Hermann Park is where the dinosaur museum is, which I recommend having a look around. They have four T. rexes, which is pretty damn impressive. But also complete overkill. T. rex fossils are not like drinks; you don't need a few to get you going.

Sam Houston Park

San Antonio, on the other hand, is a different kind of place altogether. The weather improved drastically when I arrived in San Antonio, which I think served only to enhance the contrast with Houston. The city has a meandering river (well, I think it's been modified enough for it really to be called a canal) running right through the middle. Its banks are covered in bars and restaurants, and the streets above are too. The Alamo is right in the center, and the associated missions (think fortified churches) are dotted along the San Antonio River. I walked a few miles out of town, via the William Street historic district, and then hired a bike to reach the impressive San Jose and Concepcion missions, about 10 and 5 miles down the river respectively. These places are definitely worth a visit, especially if, like me, you had no idea about the interesting history of the area. All in all, San Antonio is a lot more relaxed than Houston, without the imposing blocks of skyscrapers, and with a heavy Mexican and European influence that made me feel like I was in fact in Europe. But with all the good bits of the US thrown in.

Mission San Jose

As a bit of a nature nerd, in both places I was surprised at the diversity and quantity of the wildlife. It's strange: instead of straining to catch a brief glimpse at a distance of a couple of hundred meters as in the UK, Great Egrets, Cattle Egrets, Cormorants, Herons, and even Vultures will sort of seek your company. It's very flattering. It's like they're human-watching. Which makes your bird-watching a bit of a redundant enterprise. There are turtles all along the San Antonio River too, and the parks in Houston are alive with the sound of frogs at the right time of day. You get a sense of it somehow still being the New World, somehow still teeming with life even though it's now teeming with people too.

San Antonio River Walk & Mission Trail

A few of my colleagues from Southampton were at the conference, but for the vast majority of the trip I was on my own. I was, to get poetic, alone in the Lone Star State. But really I wasn't. Despite the testing times in the US at the moment (let's not get into that), I couldn't have felt more warmth.

So, all in all, Texas is pretty much my new favorite place. When do I get to go back?

And for anyone who goes to Texas from the UK I have the following tips.


  • To “close-out” is to settle the bill or pay the tab.
  • When you pay by card in a restaurant or bar you get a receipt that you have to sign. You have to write in the tip and the total yourself.
  • Entering your PIN is not something that seems to happen often. It’s all about the swipe, even at train stations and other automated ticket machines.
  • In Houston, walking around town is not a pleasant stroll. The grid system means you only ever walk about 200 metres before having to cross a 6-lane road.
  • It’s called a crosswalk, and you have to wait for the man (who is white, not green) before you cross the road. Then there is a countdown until the traffic starts again.
  • There is not as much “health and safety”. The tram and train lines do not have barriers. The only safeguard is the driver’s reactions. You’re on your own, so have your wits about you.
  • Although a lot of Texans (and Americans in general) seem to find the idea of taking the bus utterly horrifying, the bus is really cheap and very comfortable by UK standards ($9 from Houston to San Antonio if booked in advance). The bus stations may be a bit hectic. The bus drivers may actually be nice to you.
  • When you speak to Texans they may initially struggle to understand you. I’ve been told that it’s probably the surprise of the British accent, to which they have to adjust.
  • Don’t use too many very British phrases. The catch-all word ‘alright’ is not as ubiquitous as in the UK. Over there it isn’t an adjective, a question, a greeting, a response to a greeting, and everything else, like it is back home.
  • Texans will speak to you.
  • When they do, they will probably speak their mind. They will let you know when they are happy and when they are not. They may comment on what you do (someone passive-aggressively remarked on my purchase of a $2 bottle of water, which he apparently felt was an absurd amount of money to spend on water).
  • Clothes and food are real cheap. So buy and eat.
  • The food portions are huge. It is not a myth. It took me 40 minutes to eat a Caesar salad. And I eat really fast.
  • The cars are huge. The very smallest cars you will see are BMW 3 Series, which are probably considered quite large cars in the UK.
  • Usually there are free refills on glasses of Coke and perhaps regular coffee, so take advantage.
  • There isn’t a Tesco Express on every corner like in the UK. In the Downtown areas it may be slightly difficult to buy food and drink other than from a restaurant.
  • Starbucks look and feel exactly the same as back home. So, as weird as this sounds, if you’re missing home then go to one.
  • There is free Wi-Fi pretty much everywhere. And good Wi-Fi at that.

Friday 13 January 2017

Getting to the Median (Mean, Range, SD) Intercorrelation of a set of Variables Quickly in R



Sometimes you are interested in the inter-relatedness of many variables. That's pretty much nine tenths of psychometrics.

You may want to use Meng's test of the heterogeneity of correlated coefficients (see this previous blog, and this one too, if you're interested in that test). Perhaps you have many individuals measured on numerous predictor variables and you want to check whether their relations with a shared outcome variable are differential. To do this test you'll need to find the median inter-correlation of the predictor variables - more specifically, the median Pearson's r.

This is a bit of a pain in the backside because most statistical software packages don't have a checkbox for this. Finding the median inter-correlation of a set of variables is not something SPSS, for example, will readily do for you. It's an unusual request.

Below is some simple code for finding the median inter-correlation of an array of variables. I've nicked most of it from this page on the STHDA website, so the credit goes to them. It's a good website, so check it out when you've got a minute.

But, before that, two things.

1) Create a .csv file containing ONLY the raw data for the variables you are interested in testing the heterogeneity of. So no age, gender, IP address, eye colour, ethnicity, or the shared outcome variable of interest. None of that. Just the columns for the specific variables, so the .csv should have only as many columns as there are variables. If you don't know how to do that, don't worry, it's not difficult, and you can find out here.

2) Install the package Hmisc, which contains many varied and useful functions for data analysis, not just the rcorr function used here.

Then do this
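
In outline: read the .csv in, get the full correlation matrix with rcorr() from Hmisc, pull out the unique inter-correlations, and take their median. A minimal sketch - with "predictors.csv" standing in for your own file - looks something like this:

library(Hmisc)

tbl <- read.csv("predictors.csv")    # only the predictor columns, as described above
cormat <- rcorr(as.matrix(tbl))$r    # full Pearson correlation matrix
inters <- cormat[lower.tri(cormat)]  # the unique inter-correlations (below the diagonal)

median(inters)                       # the figure you need for Meng's test
mean(inters); sd(inters); range(inters)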



The great thing about doing it this way is that next time you have to do it (on another data set collected at a different time) you can just rerun the script.


Tuesday 10 January 2017

Under What Conditions will Meng's Test of Correlated Coefficients Come Out at p < .05?

Meng's test of correlated correlation coefficients (Meng, Rosenthal, & Rubin, 1992) was the focus of my last blog. It looks like this:

\chi^2_{k-1} = (N - 3)\,\frac{\sum_{j=1}^{k}(z_{r_j} - \bar{z}_r)^2}{(1 - r_x)\,h}

Where h is derived from this:

h = \frac{1 - f\bar{r}^2}{1 - \bar{r}^2}, \qquad f = \frac{1 - r_x}{2(1 - \bar{r}^2)} \ (\text{with } f \text{ set to } 1 \text{ if it exceeds } 1)

Here z_{r_j} is the Fisher z transform of each correlation, \bar{z}_r is their mean, \bar{r}^2 is the mean of the squared correlations, r_x is the median inter-correlation among the Xs, and N is the sample size.

Which looks a little scary to the uninitiated, but it's actually quite a simple thing.

When you plug in all the required values you end up with a chi-square value. To draw conclusions about the heterogeneity of a set of correlated correlation coefficients, you will likely consult the significance of this chi-square value. If p is less than .05, you may well decide that the set of coefficients is heterogeneous.

Imagine a study where you have many individuals measured on many different things. Let's say (1) IQ, (2) GPA, (3) blood pressure, (4) the number of dogs owned, and (5) how many episodes of Westworld they've watched. In addition you have a shared outcome variable of interest - let's say procedural learning. What you want to know is whether the linear associations between the 5 predictor variables and the outcome variable are differential. That's where this test might come in handy.

But under what circumstances will you get a p less than .05, when you test the heterogeneity of 5 predictor variables with one outcome variable? What will the correlations "look" like?

A quick simulation might help answer these questions. To be sure, significance, as it always is, will be highly dependent on sample size (all else being equal, the higher the sample size, the more likely a nice low p). So we need to keep the sample size constant in our simulation. Let's use a sample size probably quite typical of a lot of the psychological research I read: n = 200.

OK, so our n is 200. Let's try correlations of exactly the same magnitude as a baseline, and a median inter-correlation of an identical figure too (that means the median of the correlations between all five Xs - in this case the 10 correlations that make up the triangle above or below the diagonal of the five variables' correlation matrix). We'll go with .50 for all, which is a large effect size according to another classic paper from the same year (Cohen, 1992).

cors <- c(.50, .50, .50, .50, .50)
mi <- (.50)
n <- 200

The coefficients are the same. Unsurprisingly, we get a chi-square of 0 with df = 4. Which, of course, is not significant.

Now let's mix it up a bit. Let's have 5 correlation coefficients drawn randomly from a distribution with a mean of .50 and an SD of .05. We can do this using this piece of code in R, rnorm(5, mean = 50, sd = 5), and then dividing by 100.

cors <- c(.51, .52, .54, .43, .53)
mi <- (.50)
n <- 200

And we get a chi-square of 4.21, again df = 4, which is p = .38. So we're getting there.

Let's crank the SD up to 10 (so .10 in correlation terms), all else constant: rnorm(5, mean = 50, sd = 10).

cors <- c(.49, .51, .60, .41, .45)
mi <- (.50)
n <- 200

And we get a chi-square of 12.26, again with df = 4, which is... p = .015. Significant! We're there already. Mission accomplished. With an SD of 10 we've got a significant test of heterogeneity. We might then conclude on the basis of this that we have heterogeneous correlated correlation coefficients. In other words (and to return to my not-at-all-ridiculous example study), the linear associations of IQ, GPA, blood pressure, the number of dogs owned, and the number of episodes of Westworld watched with procedural learning are differential. This is fantastic news if that's what we've hypothesized 😄. Or if we haven't hypothesized that but will do now 😉.

This will probably get very boring very quickly, but let's now look at how the median inter-correlation between the Xs affects our results.

Let's set the median inter-correlation to .30 (medium) for our SD 10 array of coefficients.

cors <- c(.49, .51, .60, .41, .45)
mi <- (.30)
n <- 200

Now that's a chi-square of 9.08, which is p = .059. So now it's marginal. Maybe you'll run with that.

Let's go down a notch to median inter-correlation .20

cors <- c(.49, .51, .60, .41, .45)
mi <- (.20)
n <- 200

Now that's a chi-square of 8.09 which is p = .088. Hold on that's not good.

Take it down one more.

cors <- c(.49, .51, .60, .41, .45)
mi <- (.10)
n <- 200

Oh dear. Chi-square of 7.33, which is p = .119. That's definitely not good 😞.

So what this little exercise shows is that the likelihood of Meng's test of heterogeneity coming out at p less than .05 appears to increase as the standard deviation of the coefficients increases. Which makes perfect sense. And also as the median inter-correlation of the Xs increases. Which also makes sense, but I'm still trying to get my head around it.

Flipped the other way, the less variance in your coefficients and the less they are themselves interrelated, the less chance you have of a p < .05.

When our Xs are inter-correlated with a large effect size (median r = .50), all we need is coefficients with an SD of 10 for a highly significant chi-square. And the chance of a significant chi-square will only increase as sample size increases. With an n of 1000 for our highly significant example (the one above, where we got a p = .001), we get a chi-square of 92.09(!!!), which is just completely off the charts in terms of significance (p < .00001). Now this is important, because in the Meng et al. (1992) publication their example is four obviously heterogeneous coefficients: .63, .53, .54, and -.03 (I mean, what on earth is that last one doing there). But what I have shown here is that your coefficients don't need to be anywhere near as obviously different for a pleasing p.

Here's the code for this little statistical foray:
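
A minimal sketch of that calculation - following equations 2, 3, and 5 of Meng et al. (1992) and using fisherz() from the psych package - looks like this. With the numbers from the SD-10, median-.50 example above, it should reproduce the chi-square of 12.26:

library(psych)  # for fisherz()

cors <- c(.49, .51, .60, .41, .45)  # correlations of each X with Y
mi <- .50                           # median inter-correlation among the Xs
n <- 200                            # sample size

rbar2 <- mean(cors^2)                          # mean squared correlation
f <- min((1 - mi) / (2 * (1 - rbar2)), 1)      # equation 3 (f must not exceed 1)
h <- (1 - f * rbar2) / (1 - rbar2)             # equation 2
z <- fisherz(cors)                             # Fisher z transforms
chisq <- (n - 3) * sum((z - mean(z))^2) / ((1 - mi) * h)  # equation 5
df <- length(cors) - 1

chisq                                  # the test statistic
pchisq(chisq, df, lower.tail = FALSE)  # its p value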

Meng's Test for the Heterogeneity of Correlated Correlation Coefficients Made Easier in R


Meng's test for the heterogeneity of correlated correlation coefficients (which appears in Meng, Rosenthal, & Rubin, 1992) is potentially of interest to anyone with multiple predictor variables (Xs) and a single outcome variable (Y). That's basically all and sundry. Unfortunately, the test, to my knowledge, is not programmed into any of the common statistical programs used in research psychology (e.g. SPSS, SAS) or into R, meaning that one has no option but to do it by hand (although I have heard it on the internet message board grapevine that someone may have created MatLab code for it). This just isn't cricket (a game a bit, but not much, like baseball, if you're American) in 2017.

Now, I really wanted to use this test in my research. So, with necessity being the mother of invention and all, I set about writing R code that makes it easier. After some teething problems, below is that code (slightly rough and ready, but it works). It's R code for the example test provided in Meng, Rosenthal, and Rubin (1992), following equations 2, 3, and 5 in that paper. It's contingent on the psych package being installed and loaded (install.packages("psych"), then library(psych)), so make sure you do that. If you put it into R you will see that you get the same result as Meng et al., which makes me pretty confident it's true to the equations. Modifications to this code will allow you to carry out the test in a fraction of the time that it would take by hand. For now all you need to enter is the collection of correlation coefficients, the median inter-correlation between the Xs, and the sample size. I aim to automate this from a .csv file, but for now you will have to do some things yourself. Work the first two out using your favored data-analysis program (Update 26/01/17: see my other blog to find out how to do these things quickly). The sample size is something you will know.

Then use a chi-square checker such as this one to check the significance level of the chi-square the procedure spews out at the end (which will be 5.71 for the Meng example).
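
Or skip the web checker and ask R for the p value directly - the degrees of freedom are the number of coefficients minus one, so 3 for Meng et al.'s four-coefficient example:

pchisq(5.71, df = 3, lower.tail = FALSE)  # p value for the chi-square the script spits out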