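I would like to compare rows that share the same order number in a table and append a column describing how they differ. For example, I want this: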
  order color  type shape    alert
1 1     blue   a    circle   type
2 1     blue   b    circle
3 2     green  a    circle   color
4 2     blue   a    circle   color type shape
5 2     yellow b    triangle type
6 2     yellow c    triangle
7 3     orange c    triangle
to look like this:
  order color  type shape    alert
1 1     blue   a    circle   type
2 1     blue   b    circle
3 2     green  a    circle   color type shape
4 2     blue   a    circle
5 2     yellow b    triangle
6 2     yellow c    triangle
7 3     orange c    triangle
My code only compares two adjacent rows. How can I efficiently compare all rows that share the same order number? Can I avoid loops? Here is my code:
order = c(0001, 0001, 0002, 0002, 0002, 0002, 0003)
color = c("blue", "blue", "green", "blue", "yellow", "yellow", "orange")
type = c("a", "b", "a", "a", "b", "c", "c")
shape = c("circle", "circle", "circle", "circle", "triangle", "triangle", "triangle")
df = data.frame(order, color, type, shape)
df$alert <- ""
# seq_len(nrow(df) - 1) replaces 1:nrow(df)-1, which parses as (1:nrow(df)) - 1
# and therefore starts at index 0.
for (i in seq_len(nrow(df) - 1)) {
  # Only compare row i with row i + 1 when both belong to the same order.
  if (identical(df$order[i+1], df$order[i])) {
    if (!identical(df$color[i+1], df$color[i])) {
      df$alert[i] <- paste(df$alert[i], "color")
    }
    if (!identical(df$type[i+1], df$type[i])) {
      df$alert[i] <- paste(df$alert[i], "type")
    }
    if (!identical(df$shape[i+1], df$shape[i])) {
      df$alert[i] <- paste(df$alert[i], "shape")
    }
  }
}
Answer 0 (score: 0)
Here is a solution based on dplyr (gather() comes from tidyr):
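library(dplyr)
library(tidyr)  # provides gather()
# dat1 is assumed to be the data frame from the question (called df there),
# without the alert column.
dat1 %>% gather(measure, val, -order) %>%
  group_by(order, measure) %>%
  # number of distinct values per order and column
  summarise(alerts = length(unique(val))) %>%
  # keep only the columns that vary within an order
  filter(alerts > 1) %>%
  summarise(alerts = paste0(measure, collapse = " ")) %>%
  left_join(dat1, ., by = "order")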
  order color  type shape    alerts
1 1     blue   a    circle   type
2 1     blue   b    circle   type
3 2     green  a    circle   color type shape
4 2     blue   a    circle   color type shape
5 2     yellow b    triangle color type shape
6 2     yellow c    triangle color type shape
7 3     orange c    triangle <NA>
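For comparison, the same idea (count distinct values per order and per column, then list the columns that vary) can be sketched in base R without an explicit loop over rows. This is only a sketch under some assumptions: the data frame is rebuilt here from the question's example, and the names cols, varies and alert_first are made up for illustration. The last step, which keeps the alert only on the first row of each order, is one reading of the desired output in the question, not something the answer above does.

```r
# Example data copied from the question.
df <- data.frame(
  order = c(1, 1, 2, 2, 2, 2, 3),
  color = c("blue", "blue", "green", "blue", "yellow", "yellow", "orange"),
  type  = c("a", "b", "a", "a", "b", "c", "c"),
  shape = c("circle", "circle", "circle", "circle",
            "triangle", "triangle", "triangle"),
  stringsAsFactors = FALSE
)

# Columns to check for differences (hypothetical helper name).
cols <- c("color", "type", "shape")

# For each column: does it take more than one value within the row's order?
# The result is a logical matrix with one row per row of df and one column
# per entry of cols.
varies <- vapply(cols, function(col) {
  counts <- tapply(df[[col]], df$order, function(v) length(unique(v)))
  as.vector(counts[as.character(df$order)]) > 1
}, logical(nrow(df)))

# Collapse the names of the varying columns into one string per row;
# rows whose order has no differences get an empty string.
df$alert <- apply(varies, 1, function(flag) paste(cols[flag], collapse = " "))

# Assumption: show the alert only on the first row of each order,
# as in the desired table in the question.
df$alert_first <- ifelse(duplicated(df$order), "", df$alert)

print(df)
```

Here alert repeats the list of differing columns on every row of an order, like the dplyr output above but with an empty string instead of NA, while alert_first leaves all but the first row of each order blank.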