Question

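Using the answer from here, I have previously managed to compute the minimum and maximum by group. This time it does not work, and I do not understand why. Here is a reproducible example.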

example <- structure(
  list(ID = 1:10, 
       date = 
         c("2005-05-09", "2006-09-18", "1996-06-14", "1997-01-06", 
           "1997-03-13", "1997-05-06", "1990-01-04", "1990-01-11", 
           "1989-12-28", "1989-12-28"), 
       name = c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b")), 
  .Names = c("ID", "date", "name"), 
  class = c("data.table", "data.frame"), 
  row.names = c(NA, -10L))

example[example[, .I[which.min(date)], by=c("name")]$V1]

What I expect is:

1996-06-14    a
1989-12-28    b

But I get an empty data table. Why?

Answer 1

In what follows we use:

library(data.table)
DT <- as.data.table(example)

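1) It works if you replace date with xtfrm(date) in your code. The date column is character, and which.min() expects numeric input; xtfrm() returns a numeric vector that sorts in the same order as the strings.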

DT[DT[, .I[which.min(xtfrm(date))], by=c("name")]$V1]

giving:

   ID       date name
1:  3 1996-06-14    a
2:  9 1989-12-28    b
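To see why the original call returns nothing, the behaviour can be reproduced in base R without data.table. The following is a minimal sketch; the vector `dates` and the other variable names are illustrative and not part of the question. It assumes the dates are ISO-formatted strings, so that string order matches chronological order.

```r
# Illustrative character dates, in the same format as the question's column
dates <- c("2005-05-09", "1996-06-14", "1997-01-06")

# which.min() coerces its input to double; these strings become NA,
# so no minimum is found and the result is an empty index
# (R may also emit a coercion warning, suppressed here)
idx_chr <- suppressWarnings(which.min(dates))
length(idx_chr)   # 0, which is why the subset comes back empty

# xtfrm() maps the strings to numbers that sort in the same order
keys <- xtfrm(dates)
idx_num <- which.min(keys)
dates[idx_num]    # "1996-06-14"
```

An empty index inside .I[...] selects no rows for any group, which matches the empty data table reported in the question.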

2) This gives one minimum per group:

DT[, .SD[which.min(xtfrm(date))], by = name]

giving:

   name ID       date
1:    a  3 1996-06-14
2:    b  9 1989-12-28

3) This gives all the minimums for each group:

DT[, .SD[date == min(date)], by = name]

giving:

   name ID       date
1:    a  3 1996-06-14
2:    b  9 1989-12-28
3:    b 10 1989-12-28
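The difference between 2) and 3) comes down to how ties are handled. As a rough base R sketch (the vector `dates` simply copies group b's values from the example; the variable names are illustrative), which.min() reports only the first position of the minimum, while comparing against min() keeps every tied position:

```r
# Dates for group "b" in the example; the earliest date appears twice
dates <- c("1990-01-04", "1990-01-11", "1989-12-28", "1989-12-28")

# First position of the minimum only
first_only <- which.min(xtfrm(dates))    # 3

# Every position that equals the minimum
# (min() works directly on character vectors)
all_ties <- which(dates == min(dates))   # 3 4
```

Which form is preferable depends on whether tied rows should all be reported or only one row per group is wanted.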

Trouble calculating the minimum by group in R
