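I have a table. When the values in my columns all run in one direction (A to B), I can aggregate them without trouble. I would like to know whether there is a way to aggregate, for example with dplyr, when the same pair of values can appear in either direction across ColumnA and ColumnB (A to B or B to A).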
library(data.table)
exampleset <-data.table(ColumnA = c("A2","A1","A3","A3","A4","A5"),
ColumnB = c("A1","A2","A4","A3","A3","A5"),
Colorcode = c("red","green","blue","yellow","red","red"))
Desired output:
output <- data.table(ColumnA =c("A1","A3","A3","A5"),
ColumnB=c("A2","A4","A3","A5"),
ColorcodeCount =c(2,2,1,1))
Answer 0 (score: 0)
For this particular case, the best solution is David Arenburg's, which uses pmin/pmax:
exampleset[, .(ColorcodeCount = uniqueN(Colorcode)),
           by = .(ColumnA = do.call(pmin, list(ColumnA, ColumnB)),
                  ColumnB = do.call(pmax, list(ColumnA, ColumnB)))]
However, this does not generalize well to cases where you want to order across 3 columns instead of 2.
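To illustrate how the pmin/pmax grouping keys are built, and why the idea stops short at three columns, here is a small base R sketch. The vectors `a` and `b` copy the question's data; the third vector `c3` is invented purely for illustration.

```r
# Column values from the question's example data.
a <- c("A2", "A1", "A3", "A3", "A4", "A5")
b <- c("A1", "A2", "A4", "A3", "A3", "A5")

# Elementwise min/max put every pair into one canonical order,
# so A2/A1 and A1/A2 both become (A1, A2).
lo <- pmin(a, b)
hi <- pmax(a, b)
print(data.frame(lo, hi))

# With a hypothetical third column, pmin/pmax still return only the
# smallest and largest value per row; the middle value is not captured.
c3 <- c("A3", "A3", "A1", "A2", "A5", "A4")  # invented for illustration
lo3 <- pmin(a, b, c3)
hi3 <- pmax(a, b, c3)
print(data.frame(lo3, hi3))
```

Grouping by `lo` and `hi` is what the one-liner above does inside `by`. For three or more columns you would need a full row-wise sort instead.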
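Alternatively, my solution using mapply (updated with apply) is:
You can create columns that always have the same order (so A1/A2 is treated the same as A2/A1), and then group by these order-independent columns. Something like: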
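# Sort each (ColumnA, ColumnB) pair row by row and store the result in two new columns.
# Note: := adds the columns by reference, so exampleset2 and exampleset are the same object.
exampleset2 <- exampleset[,c("unorderA","unorderB") := data.frame(t(mapply(FUN = function(...) c(...)[order(c(...))], ColumnA, ColumnB, USE.NAMES = FALSE)))]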
exampleset2[,list(ColorcodeCount = length(unique(Colorcode))), by = .(unorderA, unorderB)]
#    unorderA unorderB ColorcodeCount
# 1:       A1       A2              2
# 2:       A3       A4              2
# 3:       A3       A3              1
# 4:       A5       A5              1
On the other hand, if you want to do everything in a single call, another way is:
exampleset[,list(ColorcodeCount = length(unique(Colorcode))),
by = .(t(mapply(FUN = function(...) c(...)[order(c(...))], ColumnA, ColumnB, USE.NAMES = FALSE))[,1],
t(mapply(FUN = function(...) c(...)[order(c(...))], ColumnA, ColumnB, USE.NAMES = FALSE))[,2])]
#     t t.1 ColorcodeCount
# 1: A1  A2              2
# 2: A3  A4              2
# 3: A3  A3              1
# 4: A5  A5              1
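The same row-wise sorting idea can be extended to more than two key columns. The following is only a sketch under assumptions: the three-column table `dt`, its `ColumnC`, and the names `key_cols` and `sorted_cols` are all made up for illustration and do not come from the question.

```r
library(data.table)

# Hypothetical three-column variant of the example data (ColumnC is invented).
dt <- data.table(ColumnA   = c("A2", "A1", "A3"),
                 ColumnB   = c("A1", "A2", "A4"),
                 ColumnC   = c("A3", "A3", "A1"),
                 Colorcode = c("red", "green", "blue"))

key_cols    <- c("ColumnA", "ColumnB", "ColumnC")
sorted_cols <- paste0("key", seq_along(key_cols))

# Sort the key values within each row. apply() returns one sorted row per
# column of its result, so transpose to get one sorted row per row again.
# na.last = TRUE keeps NA values (at the end) instead of dropping them.
m      <- unname(as.matrix(dt[, key_cols, with = FALSE]))
sorted <- t(apply(m, 1, sort, na.last = TRUE))

# Store the sorted values as new columns and group by them.
dt[, (sorted_cols) := as.data.table(sorted)]
res <- dt[, .(ColorcodeCount = uniqueN(Colorcode)), by = sorted_cols]
print(res)
```

Here the first two rows contain the same three values in a different order, so they fall into one group with two distinct colors, while the third row forms its own group. Note that `apply` works on a matrix, so this can be slow on large tables.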