Compute differences in an R data frame and add them to a column

Time: 2015-11-02 21:53:33

Tags: r dataframe duplicates unique

I want to compare the rows of a table by order number and append a column that says which fields differ. For example, I want this

  order  color type    shape             alert
1     1   blue    a   circle             type
2     1   blue    b   circle                  
3     2  green    a   circle             color
4     2   blue    a   circle  color type shape
5     2 yellow    b triangle              type
6     2 yellow    c triangle                  
7     3 orange    c triangle                  

to look like this

  order  color type    shape             alert
1     1   blue    a   circle             type
2     1   blue    b   circle                  
3     2  green    a   circle             color type shape
4     2   blue    a   circle  
5     2 yellow    b triangle              
6     2 yellow    c triangle                  
7     3 orange    c triangle                  

My code only compares two adjacent rows. How can I efficiently compare all rows that share the same order number? Can I avoid a loop? Here is my code

order = c(0001, 0001, 0002, 0002, 0002, 0002, 0003) 
color = c("blue", "blue", "green", "blue", "yellow", "yellow", "orange") 
type = c("a", "b", "a", "a", "b", "c", "c") 
shape = c("circle", "circle", "circle", "circle", "triangle", "triangle", "triangle") 
df = data.frame(order, color, type, shape)

df$alert <- ""

# 1:nrow(df)-1 parses as (1:nrow(df))-1, i.e. 0..n-1; seq_len() gives 1..n-1
for(i in seq_len(nrow(df) - 1)){
  if(identical(df$order[i+1],df$order[i])){
    if(!identical(df$color[i+1],df$color[i])){
      df$alert[i] <- paste(df$alert[i],"color")
    }
    if(!identical(df$type[i+1],df$type[i])){
      df$alert[i] <- paste(df$alert[i],"type")
    }
    if(!identical(df$shape[i+1],df$shape[i])){
      df$alert[i] <- paste(df$alert[i],"shape")
    }
  }
}

1 Answer:

Answer 0: (score: 0)

Here is a solution based on dplyr (gather() comes from tidyr):

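library(dplyr)
library(tidyr)  # gather() lives in tidyr, not dplyr
# dat1 appears to be the question's data frame (order, color, type, shape)
# factor_key = TRUE keeps the column order (color, type, shape) in the alert text
dat1 %>% gather(measure, val, -order, factor_key = TRUE) %>%
         group_by(order, measure) %>%
         summarise(alerts = length(unique(val))) %>%
         filter(alerts>1) %>%
         summarise(alerts = paste0(measure, collapse = " ")) %>%
         left_join(dat1, ., by = "order")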

  order  color type    shape           alerts
1     1   blue    a   circle             type
2     1   blue    b   circle             type
3     2  green    a   circle color type shape
4     2   blue    a   circle color type shape
5     2 yellow    b triangle color type shape
6     2 yellow    c triangle color type shape
7     3 orange    c triangle             <NA>
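
Note that the output above repeats the alert on every row of an order, whereas the question asks for it only on the first row of each order, with empty strings elsewhere. A minimal base R sketch of that variant follows, with no explicit for loop. It rebuilds the question's data with stringsAsFactors = FALSE; the helper varying_cols and the vector cols are names made up for this illustration, not part of the original answer.

```r
# Same data as in the question, with strings kept as characters
df <- data.frame(
  order = c(1, 1, 2, 2, 2, 2, 3),
  color = c("blue", "blue", "green", "blue", "yellow", "yellow", "orange"),
  type  = c("a", "b", "a", "a", "b", "c", "c"),
  shape = c("circle", "circle", "circle", "circle", "triangle", "triangle", "triangle"),
  stringsAsFactors = FALSE
)

# Columns to compare within each order (illustrative name)
cols <- c("color", "type", "shape")

# Hypothetical helper: space-separated names of the columns of `d`
# that hold more than one distinct value
varying_cols <- function(d) {
  varies <- vapply(d, function(x) length(unique(x)) > 1, logical(1))
  paste(names(d)[varies], collapse = " ")
}

# One alert string per order number, named by the order value
alerts <- vapply(split(df[cols], df$order), varying_cols, character(1))

# Put the alert on the first row of each order, "" on the remaining rows
first_row <- !duplicated(df$order)
df$alert <- ifelse(first_row, unname(alerts[as.character(df$order)]), "")

print(df)
```

Here split() groups the rows by order, so every row with the same order number is compared at once rather than only neighbouring rows. An order with a single row (order 3 here) gets an empty alert instead of NA.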