I have the following data.table:
library(data.table)
# Example k-mers and their counts
dat <- data.table(kmers = c("TTTTTTTTTTTT", "TCCATTCCATTC", "TTCCATTCCATT",
"CCATTCCATTCC", "ATTCCATTCCAT", "CATTCCATTCCA", "TTTTATTATTTT",
"AAAATTATAAAA", "AAGACAATTTCT", "AAAGACAATTTC"), counts = c(16361L,
10090L, 9599L, 9021L, 8516L, 8325L, 5739L, 5642L, 5378L, 5326L
))
Here is the table:
           kmers counts
 1: TTTTTTTTTTTT  16361
 2: TCCATTCCATTC  10090
 3: TTCCATTCCATT   9599
 4: CCATTCCATTCC   9021
 5: ATTCCATTCCAT   8516
 6: CATTCCATTCCA   8325
 7: TTTTATTATTTT   5739
 8: AAAATTATAAAA   5642
 9: AAGACAATTTCT   5378
10: AAAGACAATTTC   5326
I want to divide each value in the counts column by the sum of all counts. For a data frame I would do:
total=sum(dat$counts)
freq <- dat$counts/total
How can I do this for a data.table? Each kmer is unique, so I don't expect any duplicate values in the kmers column.
For example, for the first row it would be 16361/sum(dat$counts).
Answer 0 (score: 0)
Alternatively, plain base R syntax still works on a data.table:
dat$countProportion = dat$counts / sum(dat$counts)
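For comparison, the same result can probably be had with data.table's own `:=` operator, which adds the column by reference instead of going through `$<-`. This is only a minimal sketch: it assumes the data.table package is installed, rebuilds `dat` from the question so it runs on its own, and the column name `freq` is just an illustrative choice.

```r
library(data.table)

# Rebuild the example table from the question
dat <- data.table(
  kmers = c("TTTTTTTTTTTT", "TCCATTCCATTC", "TTCCATTCCATT", "CCATTCCATTCC",
            "ATTCCATTCCAT", "CATTCCATTCCA", "TTTTATTATTTT", "AAAATTATAAAA",
            "AAGACAATTTCT", "AAAGACAATTTC"),
  counts = c(16361L, 10090L, 9599L, 9021L, 8516L,
             8325L, 5739L, 5642L, 5378L, 5326L)
)

# Add a new column by reference: each count divided by the total of all counts.
# "freq" is a hypothetical column name, not one from the question.
dat[, freq := counts / sum(counts)]

# If only the vector is needed, without modifying dat:
freq_vec <- dat[, counts / sum(counts)]
```

Because `:=` modifies `dat` in place, there is no need to assign the result back to `dat`.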