I have a data frame of the form:
x <-
Chrom sample1 sample2 sample3 ...
Contig12 0/0 0/0 0/1
Contig12 ./. ./. 0/0
Contig28 0/0 0/0 0/0
Contig28 1/1 1/1 1/1
Contig55 0/0 0/0 0/1
Contig55 0/1 0/1 0/1
Contig61 ./. 0/1 1/1
.
.
.
It has ~20,000 rows and ~100 unique columns. I am trying to count how many times each unique state occurs in each column (sample), so that I get:
sample1 sample2 sample3 ...
./. 2 1 0
0/0 3 3 2
0/1 1 2 3
1/1 1 1 2
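Any suggestions on how to do this? I tried using count() from the plyr package, but I can't figure out how to apply it to every column.
Thanks very much for your help!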
Answer 0 (score: 2)
library(dplyr)
library(tidyr) # gather() comes from tidyr, not dplyr
df %>%
  # gather() reshapes wide to long: every column except Chrom is collapsed
  # into two columns, key (sample name) and value (genotype). See ?gather.
  gather(key, value, -Chrom) %>%
  dplyr::select(-Chrom) %>% # drop Chrom, keeping only key and value
  table() # count each unique (key, value) pair
value
key ./. 0/0 0/1 1/1
sample1 2 3 1 1
sample2 1 3 2 1
sample3 0 2 3 2
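The same counts can also be produced in base R, with the states as rows and the samples as columns, which is the layout the question asks for. The following is only a sketch: the names samples, states, and counts are illustrative and do not come from the original answer, and the example data is repeated so the block runs on its own.

```r
# Example data copied from this answer
df <- read.table(text = "
Chrom sample1 sample2 sample3
Contig12 0/0 0/0 0/1
Contig12 ./. ./. 0/0
Contig28 0/0 0/0 0/0
Contig28 1/1 1/1 1/1
Contig55 0/0 0/0 0/1
Contig55 0/1 0/1 0/1
Contig61 ./. 0/1 1/1
", header = TRUE, stringsAsFactors = FALSE)

samples <- df[-1]                        # drop the Chrom column
states <- sort(unique(unlist(samples)))  # every state seen in any sample

# Tabulate each column separately. Converting to a factor with a fixed
# set of levels keeps states that never occur in a column as zero counts.
counts <- sapply(samples, function(col) table(factor(col, levels = states)))
counts
```

Fixing the factor levels matters here: without it, a column that lacks some state (sample3 has no ./.) would return a shorter table, and sapply() could not combine the results into a single matrix.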
# Example data for the pipeline in this answer (define df before running it)
df <- read.table(text="
Chrom sample1 sample2 sample3
Contig12 0/0 0/0 0/1
Contig12 ./. ./. 0/0
Contig28 0/0 0/0 0/0
Contig28 1/1 1/1 1/1
Contig55 0/0 0/0 0/1
Contig55 0/1 0/1 0/1
Contig61 ./. 0/1 1/1
", header = TRUE, stringsAsFactors = FALSE)