I have a data frame like this:
Server Avg_Cpu 95th
Server01 40 90
Server01 45 90
Server02 56 80
Server02 50 80
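I need to subset this data frame so that each unique server appears once, keeping the row with the highest 95th and Avg_Cpu.
My final df would look like this: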
Server Avg_Cpu 95th
Server01 45 90
Server02 56 80
I tried the dplyr package, as follows:
df %>% group_by(Server) %>% filter(Avg_Cpu==max(Avg_Cpu))
This doesn't work; I get:
Error: filter condition does not evaluate to a logical vector.
Answer 0 (score: 1)
Try checking the structure of your data.frame, df, with dput(df) or str(df), because this works for me:
df <- read.table(textConnection("Server Avg_Cpu 95th
Server01 40 90
Server01 45 90
Server02 56 80
Server02 50 80"), header = TRUE)
library(dplyr)
df %>%
  group_by(Server) %>%
  filter(Avg_Cpu == max(Avg_Cpu),
         X95th == max(X95th))
# Source: local data frame [2 x 3]
# Groups: Server [2]
#
# Server Avg_Cpu X95th
# (fctr) (int) (int)
# 1 Server01 45 90
# 2 Server02 56 80
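Note that read.table renamed the 95th column to X95th, because 95th is not a syntactically valid R name. In my case str(df) returns: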
# > str(df)
# 'data.frame': 4 obs. of 3 variables:
# $ Server : Factor w/ 2 levels "Server01","Server02": 1 1 2 2
# $ Avg_Cpu: int 40 45 56 50
# $ X95th : int 90 90 80 80
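If your data frame still has the column under its original name 95th (for example because it was read with check.names = FALSE), the name has to be wrapped in backticks inside filter(). The following is a minimal sketch under that assumption, not a diagnosis of the error in the question; the variable name result is only illustrative.

```r
library(dplyr)

# Assumption: the column keeps its original, non-syntactic name "95th".
# check.names = FALSE stops read.table from renaming it to X95th.
df <- read.table(text = "Server Avg_Cpu 95th
Server01 40 90
Server01 45 90
Server02 56 80
Server02 50 80", header = TRUE, check.names = FALSE)

# Backticks let dplyr refer to a column whose name starts with a digit.
result <- df %>%
  group_by(Server) %>%
  filter(Avg_Cpu == max(Avg_Cpu),
         `95th` == max(`95th`)) %>%
  ungroup()  # drop the grouping so later steps see a plain tibble

print(result)
```

Both conditions must hold on the same row, so a server whose highest Avg_Cpu and highest 95th fall on different rows would drop out of the result entirely.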