Applying a function to multiple columns as arguments

Date: 2012-02-03 21:12:19

Tags: r dataset plyr apply

I have a large dataset with many variables:

set.seed(14)
pool <- sample(c("AA", "AB", "BB"), 100, replace = TRUE)
mydf <- data.frame(M1 = pool[1:10], M2 = pool[11:20],
  M3 = pool[21:30], M4 = pool[31:40], M5 = pool[41:50],
  M6 = pool[51:60], M7 = pool[61:70], M8 = pool[71:80],
  M9 = pool[81:90], M10 = pool[91:100])

The "hapassoc" package needs to be installed if it isn't already:

install.packages("hapassoc")

>  library(hapassoc)
> example1.haplos <- pre.hapassoc(mydf, numSNPs = 3, allelic= F)

Haplotypes will be based on the following SNPs (genotypic format): 
 M8, M9, M10 
Remaining variables are: 
 M1, M2, M3, M4, M5, M6, M7

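This treats the last 3 variables as one group. But I want to apply this function after splitting the data into smaller groups of columns: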

M1, M2, M3   group 1
M4, M5       group 2
M6, M7, M8   group 3
M9, M10      group 4 

So numSNPs would be given by the following vector:

nsp <- c(3, 2, 3, 2)
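
As a minimal sketch, a size vector like this could be turned into the column positions of each group with base R alone. It assumes the groups are contiguous and follow the column order of the data frame; the names group_id and col_groups are hypothetical and used only for illustration.

```r
# Group sizes from the question; contiguous, left-to-right grouping is assumed.
nsp <- c(3, 2, 3, 2)

# Label each column position with its group number: 1 1 1 2 2 3 3 3 4 4
group_id <- rep(seq_along(nsp), times = nsp)

# Split the column positions 1..sum(nsp) by group label (a list of 4 vectors).
col_groups <- split(seq_len(sum(nsp)), group_id)
```

Each element of col_groups could then be used to subset the data, for example mydf[, col_groups[[1]]] for the first group.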

I want to keep the $haploMat for each group:

example1.haplos$haploMat
 haplo1 haplo2
1    hBBA   hBAB
3    hAAB   hABB
4    hABA   hABA
6    hAAA   hBBA
7    hAAA   hAAA
8    hBBA   hBBB
9    hABB   hBBB
10   hABA   hBAB
12   hAAA   hBBB
13   hAAB   hBBA
14   hABA   hABA
15   hAAB   hBAB

The final output should have eight columns: group1.haplo1, group1.haplo2, group2.haplo1, group2.haplo2, group3.haplo1, group3.haplo2, group4.haplo1, group4.haplo2.

How can I achieve this?

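1 answer:

Answer 0: (score: 1)

Is this what you're after? (Specify the column numbers of each group as the elements of the list assigned to grps.) You need to have the reshape2 package installed. You could do something similar with rbind.fill() from the plyr package.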

set.seed(14)
pool <- sample(c("AA", "AB", "BB"), 100, replace = TRUE)
mydf <- data.frame(M1 = pool[1:10], M2 = pool[11:20],
  M3 = pool[21:30], M4 = pool[31:40], M5 = pool[41:50],
  M6 = pool[51:60], M7 = pool[61:70], M8 = pool[71:80],
  M9 = pool[81:90], M10 = pool[91:100])

library(hapassoc)

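# Column numbers belonging to each group
grps <- list(1:3, 4:5, 6:8, 9:10)

# Run pre.hapassoc() on each group's columns and keep only $haploMat
haplos <- lapply(grps, function(x) {
    out <- pre.hapassoc(mydf[, x], numSNPs = length(x), allelic = FALSE,
      verbose = FALSE)$haploMat
    row.names(out) <- as.numeric(row.names(out))
    out
})

# Transpose so that haplo1/haplo2 become the rows of each matrix
haplos <- lapply(haplos, t)

library(reshape2)
# Long format: Var1 (haplo1/haplo2), Var2 (row label), haplotype, L1 (group index)
haplos <- melt(haplos, value.name = 'haplotype')
# Wide format: one row per Var2, one column per group/haplo combination
haplos <- dcast(haplos, Var2 ~ L1 + Var1, value.var = 'haplotype')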

RESULT

haplos

   Var2 1_haplo1 1_haplo2 2_haplo1 2_haplo2 3_haplo1 3_haplo2 4_haplo1 4_haplo2
1     1     hABA     hABB      hBA      hBA     hAAA     hAAB      hAA      hAA
2     2     <NA>     <NA>      hAB      hAB     hAAB     hABB      hAA      hAA
3     3     hBAA     hAAB      hBA      hBB     hBBB     hBAA      hAA      hBA
4     4     hBBB     hBAA      hBA      hAB     <NA>     <NA>      hAB      hBB
5     5     <NA>     <NA>      hBB      hAA     hABB     hAAA      hAB      hBB
6     6     hABB     hBBB      hBA      hBB     hABA     hAAB      hBB      hBB
7     7     hBBB     hBBB      hAA      hAA     hBBB     hBAA      hAB      hBB
8     8     hBBB     hABA      hBA      hAB     <NA>     <NA>      hAA      hAA
9     9     <NA>     <NA>      hBB      hAA     hAAB     hAAB      hAA      hAB
10   10     hBBB     hBAA      hAA      hBA     hABB     hBBB      hAB      hAB
11   11     <NA>     <NA>      hBB      hBB     hBBA     hBBB     <NA>     <NA>
12   12     hBBB     hABA      hAB      hBB     hABA     hABB     <NA>     <NA>
13   13     <NA>     <NA>     <NA>     <NA>     hABB     hBAA     <NA>     <NA>
14   14     hABB     hBBB     <NA>     <NA>     <NA>     <NA>     <NA>     <NA>
15   15     <NA>     <NA>     <NA>     <NA>     hAAB     hBBA     <NA>     <NA>
16   16     hBAA     hABA     <NA>     <NA>     hAAA     hBBB     <NA>     <NA>
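
The columns printed above are named 1_haplo1, 1_haplo2 and so on, while the question asked for names of the form group1.haplo1. A minimal base R sketch of one possible renaming follows. The vector old_names is typed in by hand from the printed header so that the snippet runs on its own, and new_names is a hypothetical name; on the real result the same sub() call would presumably be applied to names(haplos).

```r
# Column names copied from the printed result (a hand-typed stand-in).
old_names <- c("Var2", "1_haplo1", "1_haplo2", "2_haplo1", "2_haplo2",
               "3_haplo1", "3_haplo2", "4_haplo1", "4_haplo2")

# Rewrite "<group>_<haplo>" as "group<group>.<haplo>".
# Names that do not match the pattern (here "Var2") are left unchanged.
new_names <- sub("^([0-9]+)_(haplo[0-9]+)$", "group\\1.\\2", old_names)

print(new_names)
```

The pattern is anchored at both ends, so only names with the exact group_haplo shape are rewritten.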