Calculating row means in a data table for selected columns in R

Time: 2019-02-07 22:44:39

Tags: r

I have a data table, shown below.

Table:

LP   GMweek1  GMweek2   GMweek3  PMweek1   PMweek2  PMweek3
215   45       50        60       11        0.4     10.2
0.1   50       61        24       12        0.8     80.0
0     45       24        35       22        20.0    15.4
51    22.1     54        13       35        16      2.2  

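I would like to get the output table below. My code, shown after it, does not work. Can someone help me figure out what I am doing wrong?

Thanks for your help.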

Output:

LP   GMweek1  GMweek2   GMweek3  PMweek1   PMweek2  PMweek3  AvgGM   AvgPM
215   45       50        60       11        0.4     10.2     51.67   7.20
0.1   50       61        24       12        0.8     80.0     45.00   30.93
0     45       24        35       22        20.0    15.4     34.67   19.13
51    22.1     54        13       35        16      2.2      29.70   17.73

sel_cols_GM <- c("GMweek1","GMweek2","GMweek3")
sel_cols_PM <- c("PMweek1","PMweek2","PMweek3")

Table <- Table[, .(AvgGM = rowMeans(sel_cols_GM)), by = LP]
Table <- Table[, .(AvgPM = rowMeans(sel_cols_PM)), by = LP]

1 answer:

Answer 0 (score: 2)

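OK, so you are doing a few things wrong. First, rowMeans cannot operate on a character vector; if you want to select columns with a character vector, you must use .SD and pass that vector to .SDcols. Second, you are combining a row-wise aggregation with grouping (by = LP), which I don't think makes much sense here. Third, even if your expression did not throw an error, you are assigning the result back to Table, which would overwrite your original data (if you want to add a new column, use := to add it by reference).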
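To make the first point concrete, here is a minimal sketch. The two-row table DT and the name cols are made up for illustration and are not part of the question. It shows that rowMeans() rejects a bare character vector, while .SD restricted by .SDcols gives it actual columns to work with.

```r
library(data.table)

# Hypothetical two-row table, for illustration only
DT <- data.table(LP = c(215, 0.1), GMweek1 = c(45, 50), GMweek2 = c(50, 61))
cols <- c("GMweek1", "GMweek2")

# A character vector is just column names, not data, so rowMeans() errors.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(rowMeans(cols), error = function(e) conditionMessage(e))
print(res)

# Inside [.data.table, .SD is the subset of DT limited to the .SDcols columns,
# so rowMeans() receives a two-dimensional object and returns one mean per row.
means <- DT[, rowMeans(.SD), .SDcols = cols]
print(means)
```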

What you want is the row means of the selected columns, which you can compute like this:

Table[, AvgGM := rowMeans(.SD), .SDcols = sel_cols_GM] 
Table[, AvgPM := rowMeans(.SD), .SDcols = sel_cols_PM]

This means: create these new columns as the row means of the subset of my data (.SD) that refers to these columns (.SDcols).
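For anyone who wants to reproduce this end to end, the following is a sketch that rebuilds the question's table by hand and applies the two lines above. Building the table with data.table() and rounding to two decimals for display are my assumptions, chosen only to match the expected output shown in the question.

```r
library(data.table)

# The table from the question, rebuilt by hand
Table <- data.table(
  LP      = c(215, 0.1, 0, 51),
  GMweek1 = c(45, 50, 45, 22.1),
  GMweek2 = c(50, 61, 24, 54),
  GMweek3 = c(60, 24, 35, 13),
  PMweek1 = c(11, 12, 22, 35),
  PMweek2 = c(0.4, 0.8, 20.0, 16),
  PMweek3 = c(10.2, 80.0, 15.4, 2.2)
)

sel_cols_GM <- c("GMweek1", "GMweek2", "GMweek3")
sel_cols_PM <- c("PMweek1", "PMweek2", "PMweek3")

# := adds each column by reference, so Table is not reassigned
Table[, AvgGM := rowMeans(.SD), .SDcols = sel_cols_GM]
Table[, AvgPM := rowMeans(.SD), .SDcols = sel_cols_PM]

# Rounding here is for display only; the stored columns keep full precision
print(Table[, .(LP, AvgGM = round(AvgGM, 2), AvgPM = round(AvgPM, 2))])
```

Because := modifies Table in place, the original seven columns are still there afterwards, with AvgGM and AvgPM appended.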