Thursday 14 December 2017

Converting a Correlation Matrix to a Covariance Matrix with Lavaan


Researchers are sometimes interested in converting a correlation matrix into a covariance matrix. Perhaps they happen to use statistical software or an R package that accepts covariance matrices as input, but not correlation matrices (e.g., the fantastic SEM package Lavaan (Rosseel, 2012), or the first stage of the two-stage meta-analytic SEM method implemented in the metaSEM package (Cheung, 2015)). Or perhaps they are just inexplicably inquisitive and/or enjoy wasting hours performing pointless transformations.

Conveniently, the fantastic Lavaan has a built-in function for converting a correlation matrix into a covariance matrix (other packages probably do too, but I don't know about them: forgive me). I'm going to show you how to use that function here because, as far as I can see, the Lavaan documentation doesn't show you how to use it in any real detail.

First, you need to provide the correlation matrix. Simply duplicate the matrix that appears in the paper you're interested in; for example, the one below from my friend Jennifer's paper on spitefulness and humor styles.



It needs to be symmetric, so you should fill in both the upper and lower triangles of the matrix, and put a correlation of 1 on the diagonal (each variable's correlation with itself).

Use this code:

# 5 x 5 correlation matrix, entered row by row (symmetric, with 1s on the diagonal)
corr1 <- matrix(c( 1.00, -.47, -.16, -.18, -.26,
                   -.47, 1.00,  .07,  .04,  .36,
                   -.16,  .07, 1.00, -.13, -.10,
                   -.18,  .04, -.13, 1.00,  .19,
                   -.26,  .36, -.10,  .19, 1.00),
                 nrow = 5, ncol = 5, byrow = TRUE)
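
If you're worried that a typo crept in while copying the coefficients, a quick check (my own addition; any equivalent check will do) is to confirm that the matrix really is symmetric with 1s on the diagonal:

isSymmetric(corr1)     # should return TRUE
all(diag(corr1) == 1)  # should return TRUE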


The highlighted matrix and the code should make it easier to see where each coefficient is taken from and where it should go. Follow the colours.






Second, you need to provide the standard deviations for each column (each variable, item, etc.).



Look at the SDs (above) and put them in the code below (in order!):

corr1sd <- c(.65, .62, .68, .68, .66)  # SDs, in the same order as the matrix columns



Third, you can perform the transformation using Lavaan's cor2cov function:

library(lavaan)                        # cor2cov() is in the lavaan package
cov1 <- cor2cov(corr1, sds = corr1sd)  # rescale the correlations using the SDs
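
If you'd like to reassure yourself that the conversion did what you expect, here's an optional sanity check (my own addition, not something the Lavaan documentation asks for): the diagonal of cov1 should contain the variances (the SDs squared), and base R's cov2cor function should take you back to the original correlations.

all.equal(diag(cov1), corr1sd^2)  # should return TRUE: variances = SDs squared
all.equal(cov2cor(cov1), corr1)   # should return TRUE: converting back recovers corr1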



And there you have it: covariances, with the variances on the diagonal. You can take this matrix forward into a package that works with covariance matrices, such as metaSEM, or play with it a bit and take it into Lavaan (blog post to come on how to transform this matrix so it can be used in Lavaan or metaSEM).




Cheung, M. W.-L. (2015). metaSEM: An R package for meta-analysis using structural equation modeling. Frontiers in Psychology, 5, 1521. http://journal.frontiersin.org/article/10.3389/fpsyg.2014.01521/full

Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1-36. http://www.jstatsoft.org/v48/i02/

Sunday 1 October 2017

80% power for p < .05 and p < .005

In the last few months there has been a new round of debate about p values. "Redefine statistical significance" by Benjamin et al. (2017) kicked things off. In that article the authors argued that "for research communities that continue to rely on null hypothesis significance testing, reducing the P-value threshold for claims of new discoveries to 0.005 is an actionable step that will immediately improve reproducibility" (p. 11). Fisher's arbitrary choice of 0.05, they say, is no longer adequate.

To be clear, Benjamin et al. did state that their recommendations were specifically for "claims of discovery of new effects" (p. 5). But imagine a scenario where 0.005 is the new 0.05 for all research.

What happens to my 80% power sample sizes if I switch to the new alpha level? How much larger will the samples need to be? For one-sided tests of Pearson's r (with r from .01 to .99), the answers appear in the table below.

In brief, for correlational research, switching from .05 to .005 will require you to multiply your sample size by around 1.82 (the median multiplier in the table). Stated differently, that's roughly an 82% increase in participants.

r      N at .05   N at .005   Increase in N   Multiply N by (2dp)
.01    61824      116785      54961           1.89
.02    15455      29193       13738           1.89
.03    6868       12972       6104            1.89
.04    3862       7295        3433            1.89
.05    2471       4667        2196            1.89
.06    1716       3239        1523            1.89
.07    1260       2379        1119            1.89
.08    964        1820        856             1.89
.09    762        1437        675             1.89

.10    617        1163        546             1.88
.11    509        960         451             1.89
.12    428        806         378             1.88
.13    364        686         322             1.88
.14    314        591         277             1.88
.15    273        514         241             1.88
.16    240        451         211             1.88
.17    212        399         187             1.88
.18    189        356         167             1.88
.19    170        319         149             1.88

.20    153        287         134             1.88
.21    139        260         121             1.87
.22    126        236         110             1.87
.23    115        216         101             1.88
.24    106        198         92              1.87
.25    97         182         85              1.88
.26    90         168         78              1.87
.27    83         155         72              1.87
.28    77         144         67              1.87
.29    72         134         62              1.86

.30    67         125         58              1.87
.31    63         117         54              1.86
.32    59         109         50              1.85
.33    55         102         47              1.85
.34    52         96          44              1.85
.35    49         90          41              1.84
.36    46         85          39              1.85
.37    44         80          36              1.82
.38    41         76          35              1.85
.39    39         72          33              1.85

.40    37         67          30              1.81
.41    35         65          30              1.86
.42    33         61          28              1.85
.43    32         58          26              1.81
.44    30         55          25              1.83
.45    29         53          24              1.83
.46    28         50          22              1.79
.47    26         48          22              1.85
.48    25         46          21              1.84
.49    24         44          20              1.83

.50    23         42          19              1.83
.51    22         40          18              1.82
.52    21         38          17              1.81
.53    20         37          17              1.85
.54    20         35          15              1.75
.55    19         34          15              1.79
.56    18         32          14              1.78
.57    17         31          14              1.82
.58    17         30          13              1.76
.59    16         29          13              1.81

.60    16         27          11              1.69
.61    15         26          11              1.73
.62    14         25          11              1.79
.63    14         24          10              1.71
.64    13         23          10              1.77
.65    13         23          10              1.77
.66    13         22          9               1.69
.67    12         21          9               1.75
.68    12         20          8               1.67
.69    11         19          8               1.73

.70    11         19          8               1.73
.71    11         18          7               1.64
.72    10         17          7               1.70
.73    10         17          7               1.70
.74    10         16          6               1.60
.75    9          16          7               1.78
.76    9          15          6               1.67
.77    9          14          5               1.56
.78    9          14          5               1.56
.79    8          13          5               1.63

.80    8          13          5               1.63
.81    8          12          4               1.50
.82    8          12          4               1.50
.83    7          12          5               1.71
.84    7          11          4               1.57
.85    7          11          4               1.57
.86    7          10          3               1.43
.87    6          10          4               1.67
.88    6          10          4               1.67
.89    6          9           3               1.50

.90    6          9           3               1.50
.91    6          8           2               1.33
.92    6          8           2               1.33
.93    5          8           3               1.60
.94    5          7           2               1.40
.95    5          7           2               1.40
.96    5          7           2               1.40
.97    5          6           1               1.20
.98    6
.99    5

Note. Sample size information was computed using the R package pwr, using variations on the following code (n is omitted so that pwr.r.test() solves for the required sample size; r and sig.level were varied):

pwr.r.test(r = , sig.level = 0.05, power = .80, alternative = "greater")
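
And, if you want to reproduce (or extend) a row of the table, here's a minimal sketch of that idea, assuming the required sample sizes are rounded up to the next whole participant (the r values below are just examples):

library(pwr)

# Required N for 80% power in a one-sided test of r, at each alpha level
for (r in c(.10, .30, .50)) {
  n05  <- ceiling(pwr.r.test(r = r, sig.level = 0.05,  power = .80, alternative = "greater")$n)
  n005 <- ceiling(pwr.r.test(r = r, sig.level = 0.005, power = .80, alternative = "greater")$n)
  cat("r =", r, "| N at .05 =", n05, "| N at .005 =", n005,
      "| increase =", n005 - n05, "| multiply N by", round(n005 / n05, 2), "\n")
}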