How to use dplyr::distinct based on the value of another variable

Date: 2018-03-04 07:50:55

Tags: r tidyverse

library(tidyverse)

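Using the sample data below, I want to use dplyr::distinct() based on a condition. I want to eliminate duplicates in the ID column, keeping only the row with the lowest "Rate" value for each ID. For example, for "A1A1" the row with a Rate of 2 should be dropped, and for "CC33" the rows with "Rate" equal to 2 and 3 should be dropped. I also want to end up with all columns, as with dplyr::distinct and ".keep_all = TRUE".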

I tried the code below, but it drops the "Subject" column.

DF2%>%group_by(ID)%>%summarise(Min_rate=min(Rate))

I also experimented with group_by, mutate and if_else, but couldn't get it to work...

DF2%>%group_by(ID)%>%mutate(if_else(Rate=min(Rate),Rate,distinct(ID)

Any help would be appreciated...

Sample data:

ID<-c("A1A1","A22B","CC33","D33D","A1A1","4DD8","4DD8","CC33","CC33","56DK","F4G5","8Y0R")
Subject<-c("Subject1","Subject2","Subject3","Subject4","Subject5","Subject6","Subject7","Subject8","Subject9","Subject10","Subject11","Subject12")
Rate<-c(1,2,3,2,2,3,2,1,2,2,2,3)
DF2<-tibble(ID,Subject,Rate)

1 answer:

Answer 0 (score: 0)

I found a way to get what I wanted. First, I used dplyr's "group_by" and "mutate" functions with "if_else" to recode the minimum value of the Rate variable within each ID group as 1 and all other values as 0.

DF2<-DF2%>%group_by(ID)%>%mutate(Rate_Min=if_else(Rate==min(Rate),1,0))
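
To see what this step produces, here is a small sketch restricted to the three "CC33" rows from the sample data. The names cc33 and flagged are made up for illustration, and ungroup() is added only so the result is a plain tibble.

```r
library(dplyr)

# The three CC33 rows copied from the sample data (illustrative subset)
cc33 <- tibble(
  ID = c("CC33", "CC33", "CC33"),
  Subject = c("Subject3", "Subject8", "Subject9"),
  Rate = c(3, 1, 2)
)

# Flag the row(s) holding the group minimum with 1, everything else with 0
flagged <- cc33 %>%
  group_by(ID) %>%
  mutate(Rate_Min = if_else(Rate == min(Rate), 1, 0)) %>%
  ungroup()

print(flagged)
```

Only the Subject8 row (Rate 1) gets Rate_Min equal to 1, and the Subject column is still present.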

Then I used dplyr's "filter" to remove the rows flagged with 0.

DF2<-DF2%>%filter(Rate_Min==1)
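
The same result can probably be reached without the helper column. The sketch below shows two possible variants on the sample data; the names result_filter and result_distinct are made up for illustration. The first filters inside each group and would keep every tied row if several rows shared the minimum Rate. The second sorts by Rate and then relies on distinct() with ".keep_all = TRUE" keeping the first row per ID, so it always returns exactly one row per ID.

```r
library(dplyr)

# Sample data from the question, rebuilt so the block runs on its own
DF2 <- tibble(
  ID = c("A1A1", "A22B", "CC33", "D33D", "A1A1", "4DD8",
         "4DD8", "CC33", "CC33", "56DK", "F4G5", "8Y0R"),
  Subject = paste0("Subject", 1:12),
  Rate = c(1, 2, 3, 2, 2, 3, 2, 1, 2, 2, 2, 3)
)

# Variant 1: keep the row(s) with the lowest Rate within each ID (ties are kept)
result_filter <- DF2 %>%
  group_by(ID) %>%
  filter(Rate == min(Rate)) %>%
  ungroup()

# Variant 2: sort so the lowest Rate comes first within each ID,
# then keep the first row per ID along with all other columns
result_distinct <- DF2 %>%
  arrange(ID, Rate) %>%
  distinct(ID, .keep_all = TRUE)

print(result_distinct)
```

On this data there are no ties, so both variants should keep the same eight rows. If ties matter, the choice between them depends on whether all tied rows or just one should survive.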