Create all possible triplet (three at a time) combinations in R

Date: 2012-07-01 12:44:35

Tags: r combinations

Here is sample data for my case:

    mark <- paste("M", 1:6, sep = "")
    set.seed(123)
    Ind1 <- sample(c("A", "B", "H"), 6, replace = TRUE)
    set.seed(1234)
    Ind2 <- sample(c("A", "B", "H"), 6, replace = TRUE)
    set.seed(12345)
    Ind3 <- sample(c("A", "B", "H"), 6, replace = TRUE)
    set.seed(12344)
    Ind4 <- sample(c("A", "B", "H"), 6, replace = TRUE)
    set.seed(1234567)
    Ind5 <- sample(c("A", "B", "H"), 6, replace = TRUE)
    myd <- data.frame(mark, Ind1, Ind2, Ind3, Ind4, Ind5)

The data:

 myd
  mark Ind1 Ind2 Ind3 Ind4 Ind5
1   M1    A    A    H    A    B
2   M2    H    B    H    H    H
3   M3    B    B    H    A    H
4   M4    H    B    H    A    A
5   M5    H    H    B    A    H
6   M6    A    B    A    H    B

For each variable (column), I want to compare the markers in every possible triplet (three at a time):

M1 & M2 & M3      -> first comparison
M1 & M2 & M4      -> second comparison
M1 & M2 & M5
M1 & M2 & M6
M1 & M3 & M4
M1 & M3 & M5
M1 & M3 & M6
M2 & M3 & M4
M2 & M3 & M5
M2 & M3 & M6
... and so on

So, for comparing a triplet, the logic would be as follows (T = triplet member, T1 = first, T2 = second, T3 = third):

    newvar <- 0

    if (T1 == "A" && T2 == "B" && T3 == "H") {
      newvar[i] <- 0
    } else if (T1 == "A" && T2 == "B" && T3 == "B") {
      newvar[i] <- 1
    } else if (T1 == "A" && T2 == "A" && T3 == "H") {
      newvar[i] <- 1
    } else {
      newvar[i] <- NA
    }

How can I do this?

Edit:

Let's do this for Ind1.

First comparison in the list above:
T1 = M1 = "A", T2 = M2 = "H", T3 = M3 = "B"
              newvar = NA

Similarly, the second comparison:
T1 = M1 = "A", T2 = M2 = "H", T3 = M4 = "H"
              newvar = NA

M1 ... M6 are row names (they act like variables). Once this works for Ind1, I can apply it to all of the Ind columns (Ind1 ... Ind5).

1 Answer:

Answer 0: (score: 2)

To create the possible combinations, you can use

    # as.character() works whether mark is stored as a factor or as a character vector
    combins <- t(combn(as.character(myd$mark), 3))
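
As a quick check of what `combn` returns here, a minimal sketch follows. The hard-coded `mark` vector is an assumption that simply mirrors the marker labels in the question. Six markers taken three at a time give `choose(6, 3)` = 20 triplets, and after transposing there is one triplet per row.

```r
# Marker labels as in the question (hard-coded here so the sketch is self-contained)
mark <- paste0("M", 1:6)

# combn() returns one combination per column; t() puts one triplet per row
combins <- t(combn(mark, 3))

dim(combins)     # 20 rows, 3 columns
choose(6, 3)     # 20, the number of triplets
combins[1, ]     # "M1" "M2" "M3"
```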

Then you can create a function, say

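    dum.fun <- function(x, myd) {
      # Rows of myd that hold the three markers of this triplet
      dum.match <- match(x, myd$mark)
      dum.ans <- c()
      # Columns 2:6 are Ind1 ... Ind5
      for (i in 2:6) {
        # Concatenate the three values, e.g. "A", "B", "H" -> "ABH"
        dum.str <- paste(myd[dum.match, i], collapse = "")
        dum.ans[i - 1] <- NA
        if (dum.str == "ABH") {
          dum.ans[i - 1] <- 0
        } else if (dum.str == "ABB" || dum.str == "AAH") {
          dum.ans[i - 1] <- 1
        }
      }
      dum.ans
    }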
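
If the list of scored patterns grows, the nested `if` could probably be replaced by a named lookup vector. The sketch below is only an illustration: `codes` and `score_triplet` are hypothetical names that do not appear in the answer. Any pattern missing from the table comes back as `NA`, which matches the fallback in the question.

```r
# Hypothetical lookup table: pattern string -> score
codes <- c(ABH = 0, ABB = 1, AAH = 1)

# Hypothetical helper: score one or more pasted triplet strings.
# Indexing a named vector by an unknown name yields NA.
score_triplet <- function(s) unname(codes[s])

score_triplet(c("ABH", "ABB", "AHB"))   # 0 1 NA
```

Adding a new pattern then means adding one entry to `codes` rather than another branch.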

Then

    out <- t(apply(combins, 1, dum.fun, myd))
    cbind(combins, out)
    > head(cbind(combins, out))
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
    [1,] "M1" "M2" "M3" NA   "1"  NA   NA   NA  
    [2,] "M1" "M2" "M4" NA   "1"  NA   NA   NA  
    [3,] "M1" "M2" "M5" NA   "0"  NA   NA   NA  
    [4,] "M1" "M2" "M6" NA   "1"  NA   NA   NA  
    [5,] "M1" "M3" "M4" "0"  "1"  NA   NA   NA  
    [6,] "M1" "M3" "M5" "0"  "0"  NA   NA   NA  

for example.

This is all fairly messy, but hopefully I have understood what you wanted.

Or, in one call:

    t(combn(as.character(myd$mark), 3, dum.fun, myd = myd))
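
For a result that is easier to read, one option is to keep the scores numeric and label the columns. The following is a sketch under stated assumptions, not the answer's own code: the data are typed in from the table printed in the question, so the result does not depend on the random number generator, and the names `res`, `T1`, `T2`, `T3` are made up for illustration.

```r
# Data copied from the table printed in the question
myd <- data.frame(
  mark = paste0("M", 1:6),
  Ind1 = c("A", "H", "B", "H", "H", "A"),
  Ind2 = c("A", "B", "B", "B", "H", "B"),
  Ind3 = c("H", "H", "H", "H", "B", "A"),
  Ind4 = c("A", "H", "A", "A", "A", "H"),
  Ind5 = c("B", "H", "H", "A", "H", "B"),
  stringsAsFactors = FALSE
)

combins <- t(combn(myd$mark, 3))

# For each triplet, paste the three values of every Ind column and score them:
# "ABH" -> 0, "ABB" or "AAH" -> 1, anything else -> NA
score <- t(apply(combins, 1, function(tr) {
  rows <- match(tr, myd$mark)
  keys <- vapply(myd[rows, -1], paste, character(1), collapse = "")
  ifelse(keys == "ABH", 0, ifelse(keys %in% c("ABB", "AAH"), 1, NA))
}))

# Hypothetical column labels: T1..T3 for the markers, then the Ind names
res <- data.frame(combins, score)
names(res) <- c("T1", "T2", "T3", names(myd)[-1])
head(res)
```

With these data the first row (M1, M2, M3) is `NA` for every column except Ind2, which scores 1, in agreement with the output shown above.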