我有一个df,我想为两列但不同行中的唯一ID的组合创建唯一ID,并以相反的顺序显示。一个人怎么能做到这一点?感谢您的帮助。
df:
> CALL_NO DUP_OF_CALL_NO
> 171573729 171573731
> 171573729 171573731
> 171630085 171630084
> 171630085 171630084
> 171573731 171573729
> 171573731 171573729
> 171630084 171630085
> 171630084 171630085
> 171573731 171573729
> 171573731 171573729
所需的输出:
> ID CALL_NO DUP_OF_CALL_NO
> 1 171573729 171573731
> 1 171573729 171573731
> 2 171630085 171630084
> 2 171630085 171630084
> 1 171573731 171573729
> 1 171573731 171573729
> 2 171630084 171630085
> 2 171630084 171630085
> 1 171573731 171573729
> 1 171573731 171573729
答案 0 :(得分:1)
df$ID=as.numeric(factor(apply(df,1,function(x)toString(sort(x)))))
> df
CALL_NO DUP_OF_CALL_NO ID
1 171573729 171573731 1
2 171573729 171573731 1
3 171630085 171630084 2
4 171630085 171630084 2
5 171573731 171573729 1
6 171573731 171573729 1
7 171630084 171630085 2
8 171630084 171630085 2
9 171573731 171573729 1
10 171573731 171573729 1
或者您也可以使用do.call
:
as.numeric(factor(do.call(function(x,y)paste(pmin(x,y),pmax(x,y)),unname(df))))
[1] 1 1 2 2 1 1 2 2 1 1
答案 1 :(得分:0)
创建示例数据集
set.seed(1234)
df <- data.frame(
call1 = sample(1:10, 100, T),
call2 = sample(1:10, 100, T),
stringsAsFactors = F
)
# load data.table package and convert df to a data.table
library(data.table)
setDT(df)
添加组指标:
# order df so group numbers are increasing by (call1, call2)
df <- df[order(call1, call2)]
# add group id
df[, id := .GRP, by=.(call1, call2)]
答案 2 :(得分:0)
#first standardize order of CALL_NO and DUP_OF_CALL_NO to create groupings
df = df %>%
mutate( call1 = pmin(CALL_NO, DUP_OF_CALL_NO),
call2 = pmax(CALL_NO, DUP_OF_CALL_NO))
###Create unique ID across related CALL_NO and DUP_OF_CALL_NO
#convert to data.table
df<- data.table(df)
#apply ID to associate group of calls
df[, ID := .GRP, by=.(call1, call2)]
#convert back to data.frame
class(as.data.frame(df))