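I am trying to get the unique combinations of `var1` for each ID, but I keep getting a warning and the result is not expanded to one row per combination within each ID.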
ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,5,5,5,5,5,6,6,6,6)
var1 <- c("A","B","E","F","C","D","C","A","B","C","A","D","B","C",
"A","B","C","A","D","C","A","B","C","E","F","G")
df1 <- data.frame(ID,var1)
df1 <- df1[order(df1$ID, df1$var1),]
dd <- unique(df1)
library(data.table)
dd <- data.table(dd)
dd[,new4 := t(combn(sort(var1), m = 3))[,1],by= "ID"]
dd[,new5:= t(combn(sort(var1), m = 3))[,2],by="ID"]
dd[,new6:= t(combn(sort(var1), m = 3))[,3],by="ID"]
Warning message:
In `[.data.table`(dd, , `:=`(new4, t(combn(sort(var1), m = 3))[, :
RHS 1 is length 10 (greater than the size (5) of group 1). The last 5 element(s) will be discarded.
ID var1 new4 new5 new6
1: 1 A A B C
2: 1 B A B E
3: 1 C A B F
4: 1 E A C E
5: 1 F A C F
6: 2 A A B C
7: 2 B A B D
8: 2 C A C D
9: 2 D B C D
10: 3 A A B C
11: 3 B A B D
12: 3 C A C D
13: 3 D B C D
14: 4 A A B C
15: 4 B A B C
16: 4 C A B C
17: 5 A A B C
18: 5 B A B D
19: 5 C A C D
20: 5 D B C D
21: 6 C C E F
22: 6 E C E G
23: 6 F C F G
24: 6 G E F G
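The output does not give enough combinations per ID: for ID 1 (A, B, C, E, F) it only provides 5 combinations. Is there any way to solve this? The output I want for ID 1 has 10 combinations: (ABC)(ACF)(ABF)(ABE)(BCE)(BCF)(CAB)(CAE)(CAF)(ECF)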
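Answer 0 (score: 0)
@BIN Since the number of combinations generally does not match the number of unique letters in `var1` for a given ID, the result cannot be assigned back with `:=` into the existing rows. You could try the following approach instead: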
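library(data.table)
library(dplyr)  # only needed for the %>% pipe
dd[, var1 := as.character(var1)]  # make sure var1 is character, not factor
# One row per ID: combination count, distinct letters, and all 3-letter combinations
dd[, .(Numb.Combinations = choose(var1 %>% uniqueN, 3),
       ID1 = paste0(var1 %>% unique, collapse = ""),
       Combinations = paste(combn(var1, 3, function(x) paste0(x, collapse = "")), collapse = "-")),
   by = "ID"]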
The output is similar to what you asked for at the end of your question:
   ID Numb.Combinations   ID1                            Combinations
1:  1                10 ABCEF ABC-ABE-ABF-ACE-ACF-AEF-BCE-BCF-BEF-CEF
2:  2                 4  ABCD                         ABC-ABD-ACD-BCD
3:  3                 4  ABCD                         ABC-ABD-ACD-BCD
4:  4                 1   ABC                                     ABC
5:  5                 4  ABCD                         ABC-ABD-ACD-BCD
6:  6                 4  CEFG                         CEF-CEG-CFG-EFG
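The warning in the question comes from a size mismatch: `:=` can only fill the rows a group already has, while `combn()` returns `choose(n, 3)` combinations for a group of `n` distinct letters. The following base R sketch rebuilds the question's data and compares the two counts per ID; the names `n_rows`, `n_comb` and `check` are only illustrative.

```r
# Rebuild the de-duplicated data from the question
ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,5,5,5,5,5,6,6,6,6)
var1 <- c("A","B","E","F","C","D","C","A","B","C","A","D","B","C",
          "A","B","C","A","D","C","A","B","C","E","F","G")
dd_check <- unique(data.frame(ID, var1))

# Rows available in each group (distinct letters per ID)
n_rows <- as.vector(tapply(dd_check$var1, dd_check$ID, length))
# Rows that combn(..., m = 3) would produce for each group
n_comb <- choose(n_rows, 3)

check <- data.frame(ID = sort(unique(ID)), rows = n_rows, combinations = n_comb)
print(check)
```

Whenever `combinations` differs from `rows`, assigning by reference either truncates or recycles the result, which is why ID 1 (5 rows, 10 combinations) lost half of its combinations.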
Alternatively, if you prefer one row per combination, as suggested by @akrun and @frank:
# Build a new table per ID instead of assigning with `:=`, so each group can grow
dd <- dd[, c(ID1 = paste0(var1 %>% unique, collapse = ""),
             transpose(combn(sort(var1), 3, simplify = FALSE))), by = ID]
setnames(dd, c("ID", "ID1", "New1", "New2", "New3"))
Output:
ID ID1 New1 New2 New3
1: 1 ABCEF A B C
2: 1 ABCEF A B E
3: 1 ABCEF A B F
4: 1 ABCEF A C E
5: 1 ABCEF A C F
6: 1 ABCEF A E F
7: 1 ABCEF B C E
8: 1 ABCEF B C F
9: 1 ABCEF B E F
10: 1 ABCEF C E F
11: 2 ABCD A B C
12: 2 ABCD A B D
13: 2 ABCD A C D
14: 2 ABCD B C D
15: 3 ABCD A B C
16: 3 ABCD A B D
17: 3 ABCD A C D
18: 3 ABCD B C D
19: 4 ABC A B C
20: 5 ABCD A B C
21: 5 ABCD A B D
22: 5 ABCD A C D
23: 5 ABCD B C D
24: 6 CEFG C E F
25: 6 CEFG C E G
26: 6 CEFG C F G
27: 6 CEFG E F G
ID ID1 New1 New2 New3
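For comparison, the same long format can be sketched in base R without data.table. This is not the approach from the answer above, only an assumed equivalent: the helper name `combos_by_id` and its arguments are hypothetical, and it expects every ID to have at least `m` distinct values, since `combn()` stops with an error otherwise.

```r
# Hypothetical helper: one row per m-element combination of the distinct values in each ID
combos_by_id <- function(id, value, m = 3) {
  groups <- split(value, id)
  pieces <- lapply(names(groups), function(g) {
    vals <- sort(unique(groups[[g]]))
    mat <- t(combn(vals, m))  # one combination per row
    data.frame(ID = g, mat, stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, pieces)
  names(out) <- c("ID", paste0("New", seq_len(m)))
  out
}

# Data from the question
ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,5,5,5,5,5,6,6,6,6)
var1 <- c("A","B","E","F","C","D","C","A","B","C","A","D","B","C",
          "A","B","C","A","D","C","A","B","C","E","F","G")

res <- combos_by_id(ID, var1)
print(res)
```

Note that `ID` comes back as character here because it is taken from the names produced by `split()`; convert it with `as.numeric()` if a numeric column is needed.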