I have the following table:
Class x2 x3 x4
A 14 45 53
A 8 18 17
A 16 49 20
B 78 21 48
B 8 18 5
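For each "Class" (A and B), I need to find the maximum value in column "x3", keep that row, and delete the other rows.
The output should be in the following format: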
Class x2 x3 x4
A 16 49 20
B 78 21 48
If my question is unclear, please ask me.
Thanks!
Answer 0: (score: 4)
A base R approach might be:
mydf[as.logical(with(mydf, ave(x3, Class, FUN = function(x) x == max(x)))), ]
# Class x2 x3 x4
# 3 A 16 49 20
# 4 B 78 21 48
Note, however, that if several values in a group are tied for the max, this returns multiple rows for that group.
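To make the tie caveat concrete, here is a minimal sketch. The `tied` data frame is a hypothetical variant of the question's data (row 1 is changed so that x3 is 49 twice in Class A), and the `which.max()` step is one possible way to keep only the first max row per group, not something the answer itself proposes.

```r
# Sample data from the question, rebuilt by hand
mydf <- data.frame(
  Class = c("A", "A", "A", "B", "B"),
  x2 = c(14, 8, 16, 78, 8),
  x3 = c(45, 18, 49, 21, 18),
  x4 = c(53, 17, 20, 48, 5)
)

# Hypothetical variant: make the max of x3 occur twice in Class A
tied <- mydf
tied$x3[1] <- 49

# The ave() approach keeps every row that equals its group's max
is_max <- as.logical(with(tied, ave(x3, Class, FUN = function(x) x == max(x))))
all_max_rows <- tied[is_max, ]

# which.max() returns only the first position of the max,
# so this keeps exactly one row per Class
first_max <- do.call(
  rbind,
  lapply(split(tied, tied$Class), function(g) g[which.max(g$x3), , drop = FALSE])
)

print(all_max_rows)
print(first_max)
```

Which of the two behaviours is wanted depends on whether tied rows should all be kept or only one of them.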
Here is a possible "data.table" approach:
library(data.table)
setkey(as.data.table(mydf), Class, x3)[, tail(.SD, 1), by = Class]
# Class x2 x3 x4
# 1: A 16 49 20
# 2: B 78 21 48
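The data.table line above works by sorting on Class and x3 and then taking the last row of each group. The same sort-then-take-last idea can be sketched in base R, with no extra packages; `mydf` is rebuilt here from the question's table, and `sorted` and `last_per_class` are names chosen only for this illustration.

```r
# Sample data from the question, rebuilt by hand
mydf <- data.frame(
  Class = c("A", "A", "A", "B", "B"),
  x2 = c(14, 8, 16, 78, 8),
  x3 = c(45, 18, 49, 21, 18),
  x4 = c(53, 17, 20, 48, 5)
)

# Sort by Class, then by x3 ascending, so each group's max comes last
sorted <- mydf[order(mydf$Class, mydf$x3), ]

# Keep the last row of each Class: with fromLast = TRUE, only the final
# occurrence of each Class value is flagged as "not a duplicate"
last_per_class <- sorted[!duplicated(sorted$Class, fromLast = TRUE), ]

print(last_per_class)
```

Like the `tail(.SD, 1)` version, this always returns exactly one row per group, even when the max is tied.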
Answer 1: (score: 2)
One way to do it with dplyr is:
library(dplyr)
foo %>%
#For each Class
group_by(Class) %>%
# Sort rows by x3 in descending order, so the row with the max x3
# comes first within each Class
arrange(desc(x3)) %>%
# Select the first row for each Class
slice(1)
# Class x2 x3 x4
#1 A 16 49 20
#2 B 78 21 48
Edit: Given @Ananda's point about tied values and his suggestion, you could also do the following. That said, @Richard Acriven's idea is the way to go if there are ties.
# Data
foo2 <- structure(list(Class = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), x2 = c(14L, 8L, 16L, 78L, 8L), x3 = c(49L,
18L, 49L, 21L, 18L), x4 = c(53L, 17L, 20L, 48L, 5L)), .Names = c("Class",
"x2", "x3", "x4"), class = "data.frame", row.names = c(NA, -5L
))
# Class x2 x3 x4
#1 A 14 49 53
#2 A 8 18 17
#3 A 16 49 20
#4 B 78 21 48
#5 B 8 18 5
foo2 %>%
group_by(Class) %>%
mutate(Rank = dense_rank(desc(x3))) %>%
filter(Rank == 1)
# Class x2 x3 x4 Rank
#1 A 14 49 53 1
#2 A 16 49 20 1
#3 B 78 21 48 1
Answer 2: (score: 2)
Here is one more dplyr answer to add to the batch:
library(dplyr)
df %>% group_by(Class) %>% filter(x3 == max(x3))
# Source: local data frame [2 x 4]
# Groups: Class
#
# Class x2 x3 x4
# 1 A 16 49 20
# 2 B 78 21 48
It could also be written as:
group_by(df, Class) %>% filter(x3 == max(x3))
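For comparison, the same "keep rows whose x3 equals the group max" filter can be sketched in base R with `aggregate()` and `merge()`. This is an alternative illustration rather than part of the answer; `mydf` is rebuilt from the question's table, and `max_per_class` and `result` are names chosen for the example.

```r
# Sample data from the question, rebuilt by hand
mydf <- data.frame(
  Class = c("A", "A", "A", "B", "B"),
  x2 = c(14, 8, 16, 78, 8),
  x3 = c(45, 18, 49, 21, 18),
  x4 = c(53, 17, 20, 48, 5)
)

# One row per Class holding that Class's maximum x3
max_per_class <- aggregate(x3 ~ Class, data = mydf, FUN = max)

# Inner join back on (Class, x3): only rows matching their group's max survive
result <- merge(mydf, max_per_class, by = c("Class", "x3"))

# merge() moves the join columns to the front; restore the original order
result <- result[, names(mydf)]

print(result)
```

As with `filter(x3 == max(x3))`, tied maxima would all be kept by this join.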
Answer 3: (score: 0)
Try:
do.call(rbind, lapply(split(ddf, ddf$Class), function(x) tail(x[order(x$x3),],1)))
#   Class x2 x3 x4
# A     A 16 49 20
# B     B 78 21 48