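I am using summarySE (http://www.inside-r.org/packages/cran/rmisc/docs/summarySE), which provides summary statistics for a given "measurevar". I can easily apply the function to a single column of a data frame, but I am trying to get the mean, sd, and n for every column in the data frame, using a single column as the grouping indicator. The data look like this: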
row.names INDICATOR V1 V2 V3 V4 V5 V6 V7 V8 V9
S1 high 5.374010 3.729971 3.980272 3.833704 5.842162 12.45123 4.443093 5.289410 6.156557
S2 high 5.038479 3.991111 4.086205 3.562861 5.456350 10.87315 5.613356 4.983482 4.533033
S3 low 5.875899 3.787800 4.673221 3.615008 7.484733 11.46284 4.854490 6.272030 8.048471
S4 low 5.725970 4.424558 4.289177 3.661384 6.465843 11.42358 4.819001 4.530732 7.691810
S5 high 5.856858 3.710087 4.540943 3.575522 6.064775 11.26261 4.424541 4.989965 5.957384
S6 low 4.976248 3.747748 3.830143 3.522880 5.099448 11.17344 4.610697 5.578816 5.388057
S7 high 5.748943 6.361523 4.220688 3.615529 6.699602 10.77316 4.271772 4.656495 6.058274
S8 high 6.140979 4.514577 3.878116 3.722885 5.279296 10.47886 5.244666 5.347839 5.211714
S9 low 4.677525 4.378035 4.639693 3.636484 6.341705 11.25809 4.452191 4.487125 7.306832
S10 high 5.262167 5.364728 4.212417 3.721577 5.611512 11.56090 5.512644 4.675201 6.656299
The output I need looks like this:
row.names high_mean high_sd high_n low_mean low_sd low_n
V1 5.214657 0.013264 6 4.13246 0.023869 5
V2 5.214657 0.013264 6 4.13246 0.023869 5
V3 5.214657 0.013264 6 4.13246 0.023869 5
V4 5.214657 0.013264 6 4.13246 0.023869 5
V5 5.214657 0.013264 6 4.13246 0.023869 5
V6 5.214657 0.013264 6 4.13246 0.023869 5
V7 5.214657 0.013264 6 4.13246 0.023869 5
V8 5.214657 0.013264 6 4.13246 0.023869 5
V9 5.214657 0.013264 6 4.13246 0.023869 5
I have been trying to run a command like this:
data_summary <- apply(df, 2, summarySE(measurevar = x, groupvars = indicator))
But I keep getting this error:
Error in mapvalues(x, from = names(replace), to = replace, warn_missing = warn_missing) :
object 'x' not found
Any help is much appreciated!
Answer 0: (score: 0)
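We can loop over the 'V' columns, apply summarySE to each one with 'groupvars' specified, use rbindlist to combine the results into a single data.table, create a sequence column ('ind'), and use dcast to convert from 'long' to 'wide' format.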
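library(Rmisc)       # provides summarySE
library(data.table)
# Summarise each 'V' column by INDICATOR and stack the results by position
DT <- rbindlist(lapply(grep('^V', names(df1), value = TRUE),
    function(x) summarySE(df1[c(x, 'INDICATOR')], measurevar = x, groupvars = 'INDICATOR')),
  use.names = FALSE)
setnames(DT, 3, 'mean')            # column 3 holds the means; it is named after the first variable
DT[, ind := 1:.N, by = INDICATOR]  # within each group, ind indexes the 'V' columns in order
dcast(DT, ind ~ INDICATOR, value.var = c('N', 'mean', 'sd', 'se', 'ci'))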
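If only the mean, sd and n are needed, the same layout (one row per 'V' column, one block of columns per INDICATOR level) can also be sketched in base R without summarySE. This is a minimal sketch under assumptions, not the approach from the answer above: the toy df1 and the helper name stats are made up for illustration, and the renaming step assumes the INDICATOR levels contain no "." character.

```r
# Toy data in the same shape as the question (values are made up for illustration)
df1 <- data.frame(
  INDICATOR = c("high", "high", "low", "low", "high"),
  V1 = c(1, 2, 3, 5, 3),
  V2 = c(10, 20, 30, 50, 30)
)

# Hypothetical helper: mean, sd and n of one numeric vector
stats <- function(v) c(mean = mean(v), sd = sd(v), n = length(v))

value_cols <- grep("^V", names(df1), value = TRUE)

# For each 'V' column, compute the stats within each INDICATOR level,
# then flatten them into one named vector per column
rows <- lapply(value_cols, function(col) {
  by_group <- lapply(split(df1[[col]], df1$INDICATOR), stats)
  out <- unlist(by_group)  # names look like "high.mean", "low.sd", ...
  names(out) <- sub(".", "_", names(out), fixed = TRUE)  # -> "high_mean", ...
  out
})

# One row per 'V' column, columns high_mean, high_sd, high_n, low_mean, ...
result <- do.call(rbind, rows)
rownames(result) <- value_cols
result
```

The result is a numeric matrix; wrap it in as.data.frame(result) if a data frame is preferred.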