Question

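I am trying to compare 4 groups with 2 samples (replicates) each, to test for differences among the groups in each of the 10,000+ genes from a microarray project. I am using a permutation approach and computing the F statistic, because the sample size is too small for a standard t-test.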

Example data (log10 values; see the table further below):

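Using the F statistic and a null hypothesis of no difference, I was able to compute the observed F value for each gene in R. For the permutations, though, I know there are 8C2 * 6C2 * 4C2 * 2C2 = 2520 distinct pairwise arrangements to consider, and I have not been able to code these 2520 permutations in R.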
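As a quick sanity check on that count, the product of binomial coefficients can be evaluated directly in R. This is only a sketch of the arithmetic; the variable names `n_arrangements` and `n_multinomial` are illustrative and not part of the original post.

```r
# Ways to deal 8 labelled samples into 4 ordered groups of 2:
# choose 2 of 8, then 2 of the remaining 6, and so on.
n_arrangements <- choose(8, 2) * choose(6, 2) * choose(4, 2) * choose(2, 2)
n_arrangements
# [1] 2520

# Equivalent multinomial form: 8! / (2!)^4
n_multinomial <- factorial(8) / factorial(2)^4
n_multinomial
# [1] 2520
```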

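Is there an R package or SAS macro that can give me this exact set of permutations? I tried the allPerms function from the 'permute' package and the permn function from the 'combinat' package, but I get all possible permutations rather than the restricted ones. I have seen Perl and Python code for restricted permutations, but I am not familiar enough with those languages to modify it.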

For example, for a 2-group (G1 and G2) setup with 2 replicates each (A1, A2 and B1, B2):

Gene    A1  A2  B1  B2  C1  C2  D1  D2
gene1  1.1 1.2 4.2 4.1 8.2 8.2 5.9 6.1 
gene2  2.7 2.6 3.1 2.9 7.2 7.8 7.1 7.0
.
.
gene10000 10.1 11.1 2.9 3.1 3.8 3.7 7.2 7.3
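
For reference, the observed F statistic for the two complete rows shown above could be computed roughly as follows. This is a minimal sketch assuming a balanced one-way layout with the columns ordered A1, A2, ..., D2; `expr`, `group`, and `f_stat` are hypothetical names introduced here, not taken from the original post.

```r
# The two complete example rows from the table (gene1 and gene2)
expr <- matrix(
  c(1.1, 1.2, 4.2, 4.1, 8.2, 8.2, 5.9, 6.1,
    2.7, 2.6, 3.1, 2.9, 7.2, 7.8, 7.1, 7.0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("gene1", "gene2"),
                  c("A1", "A2", "B1", "B2", "C1", "C2", "D1", "D2"))
)

# Group membership of the 8 columns: 4 groups, 2 replicates each
group <- factor(rep(c("A", "B", "C", "D"), each = 2))

# One-way ANOVA F statistic for a single gene (hypothetical helper)
f_stat <- function(y, g) {
  group_mean <- ave(y, g)                  # group mean repeated per sample
  ss_between <- sum((group_mean - mean(y))^2)
  ss_within  <- sum((y - group_mean)^2)
  k <- nlevels(g)
  n <- length(y)
  (ss_between / (k - 1)) / (ss_within / (n - k))
}

# Observed F value per gene
obs_f <- apply(expr, 1, f_stat, g = group)
obs_f
```

The same value should come out of `anova(lm(y ~ group))` for a single gene, which is a convenient cross-check before looping over all genes.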

I want the exact set of arrangements for 4 groups with 2 samples each, i.e. a matrix with 2520 rows and 8 columns.

Thanks!

Answer 1

I ended up writing my own function, which handles 4-, 6-, or 8-sample setups with pairwise permutations:

library(combinat)  # provides combn()

# x without y: the elements of x that are not in y
"%w/o%" <- function(x, y) x[!x %in% y]

# Enumerate the ways to arrange `data` (n = 2, 4, 6 or 8 labels) into
# consecutive pairs, ignoring the order within each pair.
# Returns a matrix with one arrangement per row and n columns.
permutations <- function(n, data) {
  if (!(n %in% c(2, 4, 6, 8))) stop("n must be 2, 4, 6 or 8")

  x <- t(combn(data, 2))  # one candidate pair per row
  p <- nrow(x)

  if (n == 2 || n == 4) {
    A <- matrix(nrow = p, ncol = n)
    for (i in 1:p) {
      # remaining labels first, then the chosen pair
      A[i, ] <- c(data %w/o% x[i, ], x[i, ])
    }
    return(A)
  }

  # n is 6 or 8: each pair is followed by every arrangement of the other
  # n - 2 labels (6 rows per pair when n = 6, 90 rows per pair when n = 8).
  block <- if (n == 6) 6 else 90
  A <- matrix(nrow = p * block, ncol = n)
  for (j in 1:p) {
    absent <- data %w/o% x[j, ]
    mat <- permutations(n - 2, absent)
    present <- matrix(rep(x[j, ], each = block), ncol = 2)
    A[(j - 1) * block + 1:block, ] <- cbind(present, mat)
  }
  A
}

# m <- permutations(6, LETTERS[1:6])  # 90 rows, 6 columns
m <- permutations(8, LETTERS[1:8])    # 2520 rows, 8 columns
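
The same idea can be written as a single recursive function that is not tied to 4, 6, or 8 samples, and the resulting matrix can then drive the permutation test from the question. The following is a sketch under stated assumptions, not the original answer's code: `paired_arrangements` and `f_stat` are hypothetical names, the labels are assumed to be unique and even in number, base R's `utils::combn` is used in place of the 'combinat' package, and `oneway.test(..., var.equal = TRUE)` is assumed to stand in for the classical one-way F statistic. The rows are expected to cover the same 2520 arrangements, though possibly in a different order.

```r
# Hypothetical helper: every way to arrange `labels` into consecutive pairs,
# ignoring the order within each pair. One arrangement per row.
paired_arrangements <- function(labels) {
  n <- length(labels)
  stopifnot(n >= 2, n %% 2 == 0, !anyDuplicated(labels))
  if (n == 2) return(matrix(labels, nrow = 1))
  pairs <- utils::combn(labels, 2)  # each column is one candidate first pair
  blocks <- lapply(seq_len(ncol(pairs)), function(j) {
    # arrange the remaining labels, then prepend the chosen pair to each row
    rest <- paired_arrangements(setdiff(labels, pairs[, j]))
    first <- matrix(pairs[, j], nrow = nrow(rest), ncol = 2, byrow = TRUE)
    cbind(first, rest)
  })
  do.call(rbind, blocks)
}

# Rows hold column indices: positions 1-2 form group A, 3-4 group B, etc.
perm <- paired_arrangements(1:8)
dim(perm)
# [1] 2520    8

# Permutation p-value for gene1, using the values from the question's table
y <- c(1.1, 1.2, 4.2, 4.1, 8.2, 8.2, 5.9, 6.1)
group <- factor(rep(c("A", "B", "C", "D"), each = 2))
f_stat <- function(v) {
  unname(oneway.test(v ~ group, var.equal = TRUE)$statistic)
}

obs_f <- f_stat(y)
perm_f <- apply(perm, 1, function(idx) f_stat(y[idx]))

# Share of arrangements with an F at least as large as the observed one
# (small relative tolerance guards against floating-point ties)
p_value <- mean(perm_f >= obs_f * (1 - 1e-8))
p_value
```

For 10,000+ genes, the `perm` matrix only needs to be built once and can be reused for every row of the expression matrix.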

Restricted permutations among 4 groups with 2 replicates each
