Thursday, 14 December 2017

Converting a Correlation Matrix to a Covariance Matrix with Lavaan


Researchers are sometimes interested in converting a correlation matrix into a covariance matrix.  Perhaps they happen to use statistical software or an R package that can deal with covariance matrices as input, but not correlation matrices (e.g., the fantastic SEM package Lavaan (Rosseel, 2012), or the first step of the two-stage meta-analytic SEM method implemented in the metaSEM package (Cheung, 2015)). Or perhaps they are just inexplicably inquisitive and/or enjoy wasting hours performing pointless transformations.

Conveniently, the fantastic Lavaan has a built-in function for converting a correlation matrix into a covariance matrix (other packages probably do too, but I don't know about them: forgive me). I'm going to show you how to use that function here, because the Lavaan documentation doesn't show you how to use it in any real detail (as far as I can see).

First, you need to provide the correlation matrix. Simply duplicate the matrix that appears in the paper you're interested in. For example, the one below is from my friend Jennifer's paper on spitefulness and humor styles.



The matrix needs to be symmetric, so you should fill in both the upper and lower triangles, and put a correlation of 1 on the diagonal (each variable with itself).

Use this code (it looks a lot better in the gist embedded below):

corr1 <- matrix(c(  1, -.47, -.16, -.18, -.26,
                    -.47,  1,   .07,  .04,  .36,
                    -.16, .07,   1,  -.13, -.10,
                    -.18, .04, -.13,   1,   .19,
                    -.26, .36, -.10,  .19,   1), nrow = 5, ncol = 5, byrow = FALSE)
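
If you want to make sure the matrix you typed really is symmetric with 1s on the diagonal, a quick check in base R (my addition, assuming the corr1 object defined above) is:

isSymmetric(corr1)     # should return TRUE
all(diag(corr1) == 1)  # should also return TRUE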


The highlighted matrix and the code should make it easier to identify where each coefficient is drawn from and where it should go. Follow the colours.






Second, you need to provide the standard deviations for each column (each variable, item, etc.).



Look at the SDs (above) and put them in the code below (in order!):

corr1sd <- c(.65, .62, .68, .68, .66)



Third, load Lavaan and perform the transformation using its cor2cov function:

library(lavaan)

cov1 <- cor2cov(corr1, sds = corr1sd)
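
To see the result, and to reassure yourself that nothing went wrong, you can print it and convert it back with base R's cov2cor function (a minimal check of my own, not part of the original recipe):

round(cov1, 3)                   # covariances, with variances on the diagonal
all.equal(cov2cor(cov1), corr1)  # converting back should recover corr1 (TRUE)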



And there you have it: covariances, with the variances on the diagonal. You can take this matrix forward into a package that deals with it, such as metaSEM. Or you can play with it a bit and take it into Lavaan (blog post to come on how to transform this matrix so it can be used in Lavaan or metaSEM).




Cheung, M. W. L. (2015). metaSEM: An R package for meta-analysis using structural equation modeling. Frontiers in Psychology, 5, 1521. http://journal.frontiersin.org/article/10.3389/fpsyg.2014.01521/full

Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1-36. http://www.jstatsoft.org/v48/i02/

Sunday, 1 October 2017

80% power for p < .05 and p < .005

In the last few months there has been a new round of debate on p values. "Redefine statistical significance" by Benjamin et al. (2017) kicked things off. In that article the authors argued that "for research communities that continue to rely on null hypothesis significance testing, reducing the P-value threshold for claims of new discoveries to 0.005 is an actionable step that will immediately improve reproducibility" (p. 11). In their view, Fisher's arbitrary choice of 0.05 is no longer adequate.

To be clear, Benjamin et al. did state that their recommendations were specifically for "claims of discovery of new effects" (p. 5). But imagine a scenario where 0.005 is the new 0.05 for all research.

What happens to my 80% power sample sizes if I switch to the new alpha level? How much larger will the sample need to be? For one-sided tests of Pearson's r (in steps of .01), the answers appear in the table below.

In brief, for correlational research, switching from .05 to .005 will require you to multiply your sample size by around 1.82 (the median multiplier across the table). Stated differently, that's roughly an 82% increase in participants.
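
As a quick illustration of where one row of the table comes from, here is a minimal sketch using the pwr package (the r = .20 value is just an example; compare the output with the r = .20 row below):

library(pwr)

# n needed for 80% power, one-sided test of r = .20, at each alpha level
n_05  <- ceiling(pwr.r.test(r = .20, sig.level = 0.05,  power = .80,
                            alternative = "greater")$n)
n_005 <- ceiling(pwr.r.test(r = .20, sig.level = 0.005, power = .80,
                            alternative = "greater")$n)

n_05; n_005; n_005 - n_05; round(n_005 / n_05, 2)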

r       N (p < .05)     N (p < .005)    Increase in N   Multiply N by (2dp)
.01     61824           116785          54961           1.89
.02     15455           29193           13738           1.89
.03     6868            12972           6104            1.89
.04     3862            7295            3433            1.89
.05     2471            4667            2196            1.89
.06     1716            3239            1523            1.89
.07     1260            2379            1119            1.89
.08     964             1820            856             1.89
.09     762             1437            675             1.89
.10     617             1163            546             1.88
.11     509             960             451             1.89
.12     428             806             378             1.88
.13     364             686             322             1.88
.14     314             591             277             1.88
.15     273             514             241             1.88
.16     240             451             211             1.88
.17     212             399             187             1.88
.18     189             356             167             1.88
.19     170             319             149             1.88
.20     153             287             134             1.88
.21     139             260             121             1.87
.22     126             236             110             1.87
.23     115             216             101             1.88
.24     106             198             92              1.87
.25     97              182             85              1.88
.26     90              168             78              1.87
.27     83              155             72              1.87
.28     77              144             67              1.87
.29     72              134             62              1.86
.30     67              125             58              1.87
.31     63              117             54              1.86
.32     59              109             50              1.85
.33     55              102             47              1.85
.34     52              96              44              1.85
.35     49              90              41              1.84
.36     46              85              39              1.85
.37     44              80              36              1.82
.38     41              76              35              1.85
.39     39              72              33              1.85
.40     37              67              30              1.81
.41     35              65              30              1.86
.42     33              61              28              1.85
.43     32              58              26              1.81
.44     30              55              25              1.83
.45     29              53              24              1.83
.46     28              50              22              1.79
.47     26              48              22              1.85
.48     25              46              21              1.84
.49     24              44              20              1.83
.50     23              42              19              1.83
.51     22              40              18              1.82
.52     21              38              17              1.81
.53     20              37              17              1.85
.54     20              35              15              1.75
.55     19              34              15              1.79
.56     18              32              14              1.78
.57     17              31              14              1.82
.58     17              30              13              1.76
.59     16              29              13              1.81
.60     16              27              11              1.69
.61     15              26              11              1.73
.62     14              25              11              1.79
.63     14              24              10              1.71
.64     13              23              10              1.77
.65     13              23              10              1.77
.66     13              22              9               1.69
.67     12              21              9               1.75
.68     12              20              8               1.67
.69     11              19              8               1.73
.70     11              19              8               1.73
.71     11              18              7               1.64
.72     10              17              7               1.70
.73     10              17              7               1.70
.74     10              16              6               1.60
.75     9               16              7               1.78
.76     9               15              6               1.67
.77     9               14              5               1.56
.78     9               14              5               1.56
.79     8               13              5               1.63
.80     8               13              5               1.63
.81     8               12              4               1.50
.82     8               12              4               1.50
.83     7               12              5               1.71
.84     7               11              4               1.57
.85     7               11              4               1.57
.86     7               10              3               1.43
.87     6               10              4               1.67
.88     6               10              4               1.67
.89     6               9               3               1.50
.90     6               9               3               1.50
.91     6               8               2               1.33
.92     6               8               2               1.33
.93     5               8               3               1.60
.94     5               7               2               1.40
.95     5               7               2               1.40
.96     5               7               2               1.40
.97     5               6               1               1.20
.98     6
.99     5

Note. Sample size information was computed using the R package pwr, with variations on the following code:

library(pwr)

# Leave n out so pwr.r.test solves for it; r = .30 is just a placeholder
# example, substitute the correlation of interest
pwr.r.test(r = .30, sig.level = 0.05, power = .80, alternative = "greater")
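
And to rebuild the whole table rather than one cell at a time, a small wrapper along these lines works (this is my own sketch, not the original script, but it should reproduce the table up to rounding):

library(pwr)

# Smallest whole n giving 80% power for a one-sided test of Pearson's r
n_needed <- function(r, alpha) {
  ceiling(pwr.r.test(r = r, sig.level = alpha, power = .80,
                     alternative = "greater")$n)
}

rs <- seq(.01, .97, by = .01)
n_05  <- sapply(rs, n_needed, alpha = 0.05)
n_005 <- sapply(rs, n_needed, alpha = 0.005)

power_table <- data.frame(r = rs,
                          n_05 = n_05,
                          n_005 = n_005,
                          increase = n_005 - n_05,
                          multiplier = round(n_005 / n_05, 2))
head(power_table)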