Question

我有一个这样的数据框：

     Id relationship age
1  1001            1  60
2  1001            2  50
3  1001            3  20
4  1002            1  70
5  1002            2  68
6  1002            3  23
7  1002            3  27
8  1002            3  27
9  1002            3  23
10 1003            1  60
11 1003            2  40
12 1003            3  20
13 1003            3  20

我想在新列中为相同Id的所有成员写下每个Id的大年龄，并将其命名为maxage。我需要这个结果：

     Id relationship age maxage
1  1001            1  60     60
2  1001            2  50     60
3  1001            3  20     60
4  1002            1  70     70
5  1002            2  68     70
6  1002            3  23     70
7  1002            3  27     70
8  1002            3  27     70
9  1002            3  23     70
10 1003            1  60     60
11 1003            2  40     60
12 1003            3  20     60
13 1003            3  20     60

Answer 1

如果你的数据帧是df，那么

result <- aggregate(age~Id, df, max)
df <- merge(df,result,by="Id")
colnames(df)[3:4] <- c("age","max.age")
df
#      Id relationship age max.age
# 1  1001            1  60      60
# 2  1001            2  50      60
# 3  1001            3  20      60
# 4  1002            1  70      70
# 5  1002            2  68      70
# 6  1002            3  23      70
# 7  1002            3  27      70
# 8  1002            3  27      70
# 9  1002            3  23      70
# 10 1003            1  60      60
# 11 1003            2  40      60
# 12 1003            3  20      60
# 13 1003            3  20      60

您也可以使用data.tables执行此操作，我建议实际使用它，因为它更简单，更快。

library(data.table)
dt <- data.table(df)
dt[,max.age:=max(age),by=Id]
# head(dt)
# 1: 1001            1  60      60
# 2: 1001            2  50      60
# 3: 1001            3  20      60
# 4: 1002            1  70      70
# 5: 1002            2  68      70
# 6: 1002            3  23      70

Answer 2

另一种选择是

> library(plyr)
> 
> ddply(ages, .(Id), function(df) {df$max.age = max(df$age); df})
     Id relationship age max.age
1  1001            1  60      60
2  1001            2  50      60
3  1001            3  20      60
4  1002            1  70      70
5  1002            2  68      70
6  1002            3  23      70
7  1002            3  27      70
8  1002            3  27      70
9  1002            3  23      70
10 1003            1  60      60
11 1003            2  40      60
12 1003            3  20      60
13 1003            3  20      60

我应该如何为该组的所有成员编写最大组号？

2 个答案: