Question

I have a data frame of the form:

x <-
Chrom    sample1    sample2    sample3  ...
Contig12    0/0     0/0     0/1
Contig12    ./.     ./.     0/0
Contig28    0/0     0/0     0/0
Contig28    1/1     1/1     1/1
Contig55    0/0     0/0     0/1
Contig55    0/1     0/1     0/1
Contig61    ./.     0/1     1/1
.
.
.

It has ~20,000 rows and ~100 unique columns. I am trying to count how many times each unique state occurs in each column (sample), so that I get:

         sample1    sample2     sample3     ...
./.      2          1           0
0/0      3          3           2
0/1      1          2           3
1/1      1          1           2

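Any suggestions on how to do this? I tried using count() from the plyr package, but couldn't figure out how to apply it to every column.

Thanks a lot for your help!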

Answer 1

library(dplyr)
library(tidyr)                        # gather() comes from tidyr, not dplyr
df %>% gather(key, value, -Chrom) %>% # reshape from wide to long: collapse the values of every column
                                      # except Chrom into two columns, key and value (see ?gather)
       dplyr::select(-Chrom) %>%      # drop Chrom, keeping only key and value
       table()                        # count each unique (key, value) pair

         value
 key       ./. 0/0 0/1 1/1
  sample1   2   3   1   1
  sample2   1   3   2   1
  sample3   0   2   3   2
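
For comparison, here is a minimal base R sketch that needs no packages and returns the states as rows and the samples as columns, which is the layout the question asks for. It is only an illustration, not part of the original answer: the names samples, states and counts are made up for this example, and the data frame is rebuilt inline from the seven rows shown in the question so that the snippet runs on its own.

```r
# Rebuild the example data inline (same seven rows as in the question).
df <- data.frame(
  Chrom   = c("Contig12", "Contig12", "Contig28", "Contig28",
              "Contig55", "Contig55", "Contig61"),
  sample1 = c("0/0", "./.", "0/0", "1/1", "0/0", "0/1", "./."),
  sample2 = c("0/0", "./.", "0/0", "1/1", "0/0", "0/1", "0/1"),
  sample3 = c("0/1", "0/0", "0/0", "1/1", "0/1", "0/1", "1/1"),
  stringsAsFactors = FALSE
)

samples <- df[-1]                         # every column except Chrom
states  <- sort(unique(unlist(samples)))  # all states seen in any sample

# Tabulate each column against the full set of states, so that a state
# missing from one sample is reported as 0 instead of being dropped.
counts <- sapply(samples, function(col) table(factor(col, levels = states)))
counts
```

Fixing the factor levels up front is what keeps every column's table the same length, so sapply() can bind the results into a single matrix.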

Data

df <- read.table(text="
      Chrom    sample1    sample2    sample3
             Contig12    0/0     0/0     0/1
             Contig12    ./.     ./.     0/0
             Contig28    0/0     0/0     0/0
             Contig28    1/1     1/1     1/1
             Contig55    0/0     0/0     0/1
             Contig55    0/1     0/1     0/1
             Contig61    ./.     0/1     1/1
              ", header = TRUE, stringsAsFactors = FALSE)

Frequency counts across multiple columns using R
