Aggregate function: keeping NA in a data frame

Time: 2015-02-02 16:53:20

Tags: r statistics

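I want to use R's aggregate function to sum prices grouped by several fields. However, my data also contains NA values, and I want to keep them.

What I tried: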

> dput(df)
structure(list(ID = c(1L, 2L, 3L, 4L, 4L, 1L, 2L, 3L, 4L, 1L, 
2L, 3L, 4L, 3L, 2L, 1L), REFERENCE = c("TEST1", "TEST2", "TEST3", 
"TEST4", "TEST1", "TEST2", "TEST3", "TEST4", "TEST1", "TEST2", 
"TEST3", "TEST4", "TEST1", "TEST2", "", "TEST2"), ISS = c(1234L, 
1234L, 1111L, 1111L, 1234L, 1111L, 1234L, 1111L, 1234L, NA, 1234L, 
1111L, 1234L, 1111L, 1234L, NA), Price = c(10L, NA, 20L, NA, 
10L, 12L, NA, 99L, 100L, NA, 100L, 12L, NA, 11L, 0L, 12L)), .Names = c("ID", 
"REFERENCE", "ISS", "Price"), row.names = c(NA, -16L), class = c("data.table", 
"data.frame"), .internal.selfref = <pointer: 0x0000000000100788>)
> 
> df <- aggregate(df$Price, by=list(ID=df$ID, REFERENCE=df$REFERENCE, ISS=df$ISS), FUN=sum)

Setting na.action = na.pass gives me:

Error in aggregate.data.frame(as.data.frame(x), ...) : 
  no rows to aggregate

The result I am hoping for was attached as an image (not reproduced here).

So I want to keep the NA data in my df.

Any suggestions on how to achieve this?

Thanks for your replies!

1 Answer:

Answer 0: (score: 2)

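We can use data.table methods rather than calling aggregate on a "data.table". After grouping by "ID", "REFERENCE" and "ISS" (by = list(ID, REFERENCE, ISS)), we take the sum of "Price" with sum(Price, na.rm=TRUE). The output can then be ordered by "ID" and "REFERENCE" if needed.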

 library(data.table)
 df[, sum(Price, na.rm=TRUE), by = list(ID, REFERENCE, ISS)][
                                     order(ID, REFERENCE)]
 #    ID REFERENCE  ISS  V1
 # 1:  1     TEST1 1234  10
 # 2:  1     TEST2 1111  12
 # 3:  1     TEST2   NA  12
 # 4:  2           1234   0
 # 5:  2     TEST2 1234   0
 # 6:  2     TEST3 1234 100
 # 7:  3     TEST2 1111  11
 # 8:  3     TEST3 1111  20
 # 9:  3     TEST4 1111 111
 #10:  4     TEST1 1234 110
 #11:  4     TEST4 1111   0
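
Note that with na.rm=TRUE, a group whose prices are all NA (rows 5 and 11 above) comes out as 0. If "keeping NA" means those groups should stay NA, one possible variant is sketched below. It rebuilds the question's data without the dput() pointer, and sum_keep_na is a hypothetical helper name introduced here, not part of data.table. It assumes Price is an integer column, as in the question.

```r
library(data.table)

# Same data as in the question, rebuilt by hand (assumption: values copied
# from the dput() output, without the .internal.selfref pointer)
dt <- data.table(
  ID        = c(1L, 2L, 3L, 4L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 3L, 2L, 1L),
  REFERENCE = c("TEST1", "TEST2", "TEST3", "TEST4", "TEST1", "TEST2", "TEST3",
                "TEST4", "TEST1", "TEST2", "TEST3", "TEST4", "TEST1", "TEST2",
                "", "TEST2"),
  ISS       = c(1234L, 1234L, 1111L, 1111L, 1234L, 1111L, 1234L, 1111L, 1234L,
                NA, 1234L, 1111L, 1234L, 1111L, 1234L, NA),
  Price     = c(10L, NA, 20L, NA, 10L, 12L, NA, 99L, 100L, NA, 100L, 12L, NA,
                11L, 0L, 12L)
)

# Hypothetical helper: sum the non-missing values, but return NA when
# every value in the group is missing (integer NA, since Price is integer)
sum_keep_na <- function(x) {
  if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)
}

# Group by the three fields, then order the result as in the answer
res <- dt[, .(Price = sum_keep_na(Price)),
          by = .(ID, REFERENCE, ISS)][order(ID, REFERENCE)]
print(res)
```

With this variant the groups (2, TEST2, 1234) and (4, TEST4, 1111) should show NA instead of 0, while mixed groups such as (4, TEST1, 1234) still sum their non-missing prices. Rows with NA in ISS remain their own group, as in the output above.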