Data preparation for meta-analysis using metafor

Time: 2015-06-14 03:42:43

Tags: r

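To compute effect sizes and run a meta-analysis of a dichotomous predictor on a continuous outcome (d, g), a data frame containing the mean, SD, and sample size for each study is required.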
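To make the requirement concrete, here is a minimal base R sketch of how a standardized mean difference is obtained from those six summary values. The function name `smd_from_summary` and the numbers are made up for illustration, and the small-sample correction uses the common approximation J = 1 - 3 / (4 * df - 1), which is an assumption rather than necessarily what any particular package does.

```r
# Hypothetical helper (not part of metafor): standardized mean difference
# from per-group means, SDs, and sample sizes.
smd_from_summary <- function(m1, sd1, n1, m2, sd2, n2) {
  df <- n1 + n2 - 2
  # Pooled within-group standard deviation
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / df)
  d <- (m1 - m2) / sd_pooled        # Cohen's d
  J <- 1 - 3 / (4 * df - 1)         # approximate small-sample correction
  data.frame(d = d, g = J * d)      # g is Hedges' g under this approximation
}

# Made-up summary statistics for a single study, for illustration only
res <- smd_from_summary(m1 = 10, sd1 = 2, n1 = 20, m2 = 8, sd2 = 2, n2 = 20)
res
```

A group with a single observation has an undefined SD, so the effect size for that study comes out as NA. In metafor, `escalc(measure = "SMD", ...)` takes the same six inputs as `m1i`, `sd1i`, `n1i`, `m2i`, `sd2i`, `n2i`.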

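I am trying to write some code that creates the required data frame from raw data, so that this process does not have to be done manually for each study.

Example raw dataset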

Study <- c("andrew", "andrew", "andrew", "andrew", "peters", "peters", "peters", "jess", "jess", "jess")
Score = c(100, 308, 584, 241, 241, 111, 431, 123, 321, 411)
Sex = c(1, 1, 1, 2, 2, 1, 2, 2, 1, 1)
data = cbind(Score, Sex, Study)
data

      Score Sex Study
 [1,] "100" "1" "andrew"
 [2,] "308" "1" "andrew"
 [3,] "584" "1" "andrew"
 [4,] "241" "2" "andrew"
 [5,] "241" "2" "peters"
 [6,] "111" "1" "peters"
 [7,] "431" "2" "peters"
 [8,] "123" "2" "jess"
 [9,] "321" "1" "jess"
[10,] "411" "1" "jess"
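Note that `cbind` on vectors of mixed type coerces everything to character, which is why every value in the output above is quoted. The answers in this thread call functions such as `setDT` and `aggregate` that expect a data frame, so a minimal sketch of the same data as a data frame follows. Reusing the name `data` is an assumption made so that the answers' code can run against it.

```r
Study <- c("andrew", "andrew", "andrew", "andrew", "peters",
           "peters", "peters", "jess", "jess", "jess")
Score <- c(100, 308, 584, 241, 241, 111, 431, 123, 321, 411)
Sex   <- c(1, 1, 1, 2, 2, 1, 2, 2, 1, 1)

# cbind() builds a matrix, and a matrix holds a single type: here, character
m <- cbind(Score, Sex, Study)
is.character(m)

# data.frame() keeps each column's own type, so Score and Sex stay numeric
data <- data.frame(Study, Score, Sex, stringsAsFactors = FALSE)
str(data)
```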

How can I convert the data, by sex and study, into the following format for metafor?

Study       MeanMale   MeanFemale   SDMale    SDfemale    NrowsMale    NrowsFemale
andrew         X           X          X          X            X             X
peters         X           X          X          X            X             X
jess           X           X          X          X            X             X

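I think describeBy, statsBy, or splitting the data with sapply would work, but getting the result into the required format is messy. The next goal is to bring in a Year column, for example,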

Study <- c("andrew", "andrew", "andrew", "andrew", "peters", "peters", "peters", "jess", "jess", "jess") 
Score = c(100, 308, 584, 241, 241, 111, 431, 123, 321, 411)
Sex = c(1, 1, 1, 2, 2, 1, 2, 2, 1, 1) 
Year = c(1992, 1992, 1992, 1992, 1988, 1988, 1988, 1977, 1977, 1977)
data = cbind(Study, Year, Score, Sex) 

to produce the following data.frame

Study      Year  MeanMale   MeanFemale   SDMale    SDfemale    NrowsMale    NrowsFemale
andrew     1992    X           X          X          X            X             X
peters     1988    X           X          X          X            X             X
jess       1977    X           X          X          X            X             X

2 answers:

Answer 0 (score: 1)

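We could use the devel version of data.table, i.e. v1.9.5. Instructions for installing the devel version are here.

We convert the 'data.frame' to a 'data.table' (setDT(data)), group by 'Sex' and 'Study', get the mean, sd, and .N (nrows), and use dcast (from data.table, which can take multiple value.var columns) to convert from 'long' to 'wide' format.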

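library(data.table) # v1.9.5+
# setDT() expects `data` to be a data.frame (or list), not the character matrix from cbind()
dcast(setDT(data)[, list(Mean = mean(Score), SD = sd(Score), Nrows = .N),
                  by = .(Sex, Study)],
      Study ~ c('Male', 'Female')[Sex],
      value.var = c('Mean', 'SD', 'Nrows'))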
#     Study Female_Mean Male_Mean Female_SD   Male_SD Female_Nrows Male_Nrows
#1: andrew         241  330.6667        NA 242.79484            1          3
#2:   jess         123  366.0000        NA  63.63961            1          2
#3: peters         336  111.0000  134.3503        NA            2          1

Edit

From @Arun's comment, dcast from data.table also accepts multiple functions.

dcast(setDT(data), Study ~ c('Male', 'Female')[Sex],
      fun.agg=list(mean, sd, length), value.var="Score")
#    Study Female_mean_Score Male_mean_Score Female_sd_Score Male_sd_Score
#1: andrew               241        330.6667              NA     242.79484
#2:   jess               123        366.0000              NA      63.63961
#3: peters               336        111.0000        134.3503            NA
#   Female_length_Score Male_length_Score
#1:                   1                 3
#2:                   1                 2
#3:                   2                 1

Or we can use reshape from base R after getting the mean, sd, and nrow with aggregate.

d1 <- do.call(data.frame,aggregate(Score~., transform(data, Sex=c('Male',
 'Female')[Sex]), FUN=function(x) c(Mean=mean(x), SD=sd(x), Nrows=length(x))))

reshape(d1, idvar='Study', timevar='Sex', direction='wide')
#  Study Score.Mean.Female Score.SD.Female Score.Nrows.Female Score.Mean.Male
#1 andrew               241              NA                  1        330.6667
#3   jess               123              NA                  1        366.0000
#5 peters               336        134.3503                  2        111.0000
#  Score.SD.Male Score.Nrows.Male
#1     242.79484                3
#3      63.63961                2
#5            NA                1
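The question also asks about carrying a Year column into the result. A minimal sketch of one way to extend the aggregate/reshape approach is shown here. It assumes that Year is constant within each study, so that adding it as a grouping variable and as an id variable does not split any study across rows. The object names `d2` and `wide` are chosen for illustration only.

```r
Study <- c("andrew", "andrew", "andrew", "andrew", "peters",
           "peters", "peters", "jess", "jess", "jess")
Score <- c(100, 308, 584, 241, 241, 111, 431, 123, 321, 411)
Sex   <- c(1, 1, 1, 2, 2, 1, 2, 2, 1, 1)
Year  <- c(1992, 1992, 1992, 1992, 1988, 1988, 1988, 1977, 1977, 1977)

data <- data.frame(Study, Year, Score, Sex, stringsAsFactors = FALSE)
data$Sex <- c("Male", "Female")[data$Sex]   # 1 = Male, 2 = Female

# Mean, SD, and group size per Study x Year x Sex; do.call(data.frame, ...)
# flattens the matrix column that aggregate() returns into ordinary columns
d2 <- do.call(data.frame,
              aggregate(Score ~ Study + Year + Sex, data = data,
                        FUN = function(x) c(Mean = mean(x), SD = sd(x),
                                            Nrows = length(x))))

# One row per study, with Year kept alongside Study as an id variable
wide <- reshape(d2, idvar = c("Study", "Year"), timevar = "Sex",
                direction = "wide")
wide
```

If a study spanned several years, this would return one row per Study and Year combination rather than one row per study.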

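Answer 1 (score: 0)

This gets pretty close with dplyr and reshape2. We convert Sex to a labelled factor, use mutate to get the SD and sample size by group, then melt and cast the data to get the group means with nice variable names: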

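library(reshape2); library(dplyr)

# Label the Sex codes
data$Sex <- factor(data$Sex, levels = c(1, 2), labels = c('Male', 'Female'))
# Group by Study and Sex so that SD and Nrow are computed per sex within each study
data <- mutate(group_by(data, Study, Sex), SD = sd(Score), Nrow = length(Score))
# Melt to long format, then cast to wide with one column per variable and sex
data <- melt(as.data.frame(data), id.vars = c('Study', 'Sex'))
data$value <- as.numeric(data$value)
dcast(data, Study ~ variable + Sex, mean, na.rm = TRUE)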