How can I apply summarySE to an entire data frame and get a data frame as output?

Asked: 2015-11-03 11:47:05

Tags: r

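I am using summarySE (http://www.inside-r.org/packages/cran/rmisc/docs/summarySE), which gives summary statistics for a specific "measurevar". I can easily apply the function to a single column of a data frame, but I am trying to get the mean, sd and n for every column in the data frame, using a single column as the grouping indicator. The data looks like this: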

row.names   INDICATOR   V1          V2          V3          V4          V5          V6          V7          V8          V9
S1          high        5.374010    3.729971    3.980272    3.833704    5.842162    12.45123    4.443093    5.289410    6.156557
S2          high        5.038479    3.991111    4.086205    3.562861    5.456350    10.87315    5.613356    4.983482    4.533033
S3          low         5.875899    3.787800    4.673221    3.615008    7.484733    11.46284    4.854490    6.272030    8.048471
S4          low         5.725970    4.424558    4.289177    3.661384    6.465843    11.42358    4.819001    4.530732    7.691810
S5          high        5.856858    3.710087    4.540943    3.575522    6.064775    11.26261    4.424541    4.989965    5.957384
S6          low         4.976248    3.747748    3.830143    3.522880    5.099448    11.17344    4.610697    5.578816    5.388057
S7          high        5.748943    6.361523    4.220688    3.615529    6.699602    10.77316    4.271772    4.656495    6.058274
S8          high        6.140979    4.514577    3.878116    3.722885    5.279296    10.47886    5.244666    5.347839    5.211714
S9          low         4.677525    4.378035    4.639693    3.636484    6.341705    11.25809    4.452191    4.487125    7.306832
S10         high        5.262167    5.364728    4.212417    3.721577    5.611512    11.56090    5.512644    4.675201    6.656299

The output I need looks like this:

row.names    high_mean    high_sd    high_n    low_mean    low_sd    low_n
V1           5.214657     0.013264   6         4.13246     0.023869  5
V2           5.214657     0.013264   6         4.13246     0.023869  5
V3           5.214657     0.013264   6         4.13246     0.023869  5
V4           5.214657     0.013264   6         4.13246     0.023869  5
V5           5.214657     0.013264   6         4.13246     0.023869  5
V6           5.214657     0.013264   6         4.13246     0.023869  5
V7           5.214657     0.013264   6         4.13246     0.023869  5
V8           5.214657     0.013264   6         4.13246     0.023869  5
V9           5.214657     0.013264   6         4.13246     0.023869  5

I have been trying to run a command like this:

data_summary <- apply(df, 2, summarySE(measurevar = x, groupvars = indicator))

But I keep getting this error:

Error in mapvalues(x, from = names(replace), to = replace, warn_missing = warn_missing) : 
object 'x' not found

Any help is greatly appreciated!

1 Answer:

Answer 0: (score: 0)

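We can loop over the 'V' columns, apply summarySE to each one with 'groupvars' specified, combine the results into a single data.table with rbindlist, create a sequence column ('ind'), and use dcast to reshape from 'long' to 'wide' format.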

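library(Rmisc)       # provides summarySE
library(data.table)  # rbindlist, setnames, dcast

# df1 is the data frame from the question; summarise each 'V' column by INDICATOR
DT <- rbindlist(lapply(grep('^V', names(df1), value = TRUE),
         function(x) summarySE(df1[c(x, 'INDICATOR')], measurevar = x, groupvars = 'INDICATOR')),
         use.names = FALSE)  # bind by position: the mean column is named V1, V2, ... in turn

setnames(DT, 3, 'mean')             # column 3 holds the means (named 'V1' after the first table)
DT[, ind := 1:.N, by = INDICATOR]   # ind = 1 for V1, 2 for V2, ... within each group
dcast(DT, ind ~ INDICATOR, value.var = c('mean', 'sd', 'se', 'ci'))  # use 'mean': 'V1' no longer exists after the rename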
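
For comparison, here is a minimal base-R sketch that produces the layout asked for in the question (one row per variable, with mean, sd and n for each group) without Rmisc or data.table. It is an illustration, not what summarySE does internally: `group_stats` is a hypothetical helper name, and only V1 and V2 are copied from the question's data to keep the example short. It also shows why the original `apply(df, 2, summarySE(...))` call fails: the third argument of `apply` must be a function, whereas `summarySE(measurevar = x, ...)` is evaluated immediately, before `x` exists.

```r
# Rebuild part of the question's data (V1 and V2 only, to keep this short).
df1 <- data.frame(
  INDICATOR = c("high", "high", "low", "low", "high",
                "low", "high", "high", "low", "high"),
  V1 = c(5.374010, 5.038479, 5.875899, 5.725970, 5.856858,
         4.976248, 5.748943, 6.140979, 4.677525, 5.262167),
  V2 = c(3.729971, 3.991111, 3.787800, 4.424558, 3.710087,
         3.747748, 6.361523, 4.514577, 4.378035, 5.364728),
  row.names = paste0("S", 1:10)
)

# Hypothetical helper: mean, sd and n of one numeric vector x, per level of g.
group_stats <- function(x, g) {
  parts <- split(x, g)  # one vector per group, groups in sorted order
  # 3 x n_groups matrix: rows mean/sd/n, one column per group
  stats <- sapply(parts, function(v) c(mean = mean(v), sd = sd(v), n = length(v)))
  out <- as.vector(stats)  # column-major: all stats of group 1, then group 2, ...
  names(out) <- paste(rep(colnames(stats), each = nrow(stats)),
                      rownames(stats), sep = "_")
  out
}

# Pass a function to sapply (not the result of a call), one column at a time.
value_cols <- grep("^V", names(df1), value = TRUE)
data_summary <- as.data.frame(
  t(sapply(df1[value_cols], group_stats, g = df1$INDICATOR))
)
print(data_summary)
```

The result has the columns high_mean, high_sd, high_n, low_mean, low_sd and low_n, with V1 and V2 as row names. With the ten rows shown in the question, high_n is 6 and low_n is 4. To cover all nine columns, build `df1` from the full data; the `grep("^V", ...)` line picks up every column whose name starts with V.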