How to get duplicate values based on multiple columns

Time: 2018-05-31 09:49:40

Tags: r dataframe

I created the following data frame:

 df <- data.frame(A = 1:5,
                  B = c("A", "B", "C", "B", "C"),
                  C = c("A", "A", "B", "B", "B"))

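I would like the output to contain the values that are duplicated between columns A and C, together with the corresponding value in column B. The expected data frame should be: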

    df2 <- "B"   "Dupvalues"
            1     A
            4     B

I have not been able to do this and would appreciate some help.

1 answer:

Answer 0: (score: 1)

df <- data.frame(A = 1:5,
                 B = c("A", "B", "C", "B", "C"),
                 C = c("A", "A", "B", "B", "B"),
                 stringsAsFactors = FALSE)

library(dplyr)

df %>%
  filter(B == C) %>%           # keep rows when B equals C
  group_by(A) %>%              # for each A
  transmute(DupValues = B) %>% # keep the duplicate value
  ungroup()                    # forget the grouping

# # A tibble: 2 x 2
#       A DupValues
#   <int> <chr>    
# 1     1 A        
# 2     4 B 
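
For comparison, the same filtering can probably be sketched in base R, without dplyr. This is only an illustration of the idea above (keep the rows where B equals C, then keep A and the matching value); the names `keep` and `df2` are chosen here for the example and are not part of the original answer.

```r
# Same data as above, with character columns
df <- data.frame(A = 1:5,
                 B = c("A", "B", "C", "B", "C"),
                 C = c("A", "A", "B", "B", "B"),
                 stringsAsFactors = FALSE)

# Logical vector: TRUE for rows where B and C hold the same value
keep <- df$B == df$C

# Build the result from the matching rows only
df2 <- data.frame(A = df$A[keep],
                  DupValues = df$B[keep],
                  stringsAsFactors = FALSE)
print(df2)
```
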

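Note that this approach works when your variables are character variables rather than factors. Comparing two factors with == raises an error when their level sets differ, which would be the case here (B has levels A, B, C, while C only has A and B).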
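
If the columns are already factors, one possible workaround is to convert them to character before comparing. The sketch below assumes the data frame was built with `stringsAsFactors = TRUE`; the names `df_f`, `same` and `result` are hypothetical and used only for this example.

```r
# Assumption: B and C were created as factors
df_f <- data.frame(A = 1:5,
                   B = c("A", "B", "C", "B", "C"),
                   C = c("A", "A", "B", "B", "B"),
                   stringsAsFactors = TRUE)

# Convert to character so that == compares the labels,
# regardless of each factor's level set
same <- as.character(df_f$B) == as.character(df_f$C)

# Keep A and the matching value for the rows where B equals C
result <- data.frame(A = df_f$A[same],
                     DupValues = as.character(df_f$B[same]),
                     stringsAsFactors = FALSE)
print(result)
```

Converting inside the comparison leaves the original factor columns in `df_f` unchanged, which may be preferable if other code relies on them being factors.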