Situation:
Here is my data:
> head(data1)
CHROM POS REF ALT DIFF GT
1 chr01 14653 C T 254 CT
2 chr01 14907 A G 254 AG
3 chr01 14930 A G 23 AG
4 chr01 15190 G A 260 GA
5 chr01 15211 T G 21 TG
6 chr01 16378 T C 1167 TC
> tail(data1)
154176 chrX 154901366 T A 58700 TA
154177 chrX 154901404 A T 38 AT
154178 chrX 154933406 A G 32002 AG
154179 chrX 154933419 A T 13 AT
154180 chrX 154933451 T C 32 TC
154181 chrX 154933473 G T 22 GT
What I want to do:
My current code only gives me the mean of DIFF grouped by POS group; it does not also group by CHROM.
Code:
library(plyr)
datsum <- ddply(data1, .variables = "POSgroup", .fun = function(x) {
  # Calculate the mean DIFF value for each GT group in this POSgroup
  meandiff <- ddply(x, .variables = "GT", .fun = summarise, ymean = mean(DIFF))
  # Add the center of the POSgroup range as the x position
  meandiff$center <- (x$POSgroup[1] * 1e7) + 0.5e7
  # Return the results
  meandiff
})
Can anyone help me?
Answer 0 (score: 3)
Using data.table, this will give you a starting point:
library(data.table)
dt = data.table(data1)
# CHROM is a label such as "chr01", so group by it directly; only POS is binned
dt[, mean(DIFF), by = list(CHROM, POSgroup = floor(POS/1e7))]
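To get closer to the output of the plyr code in the question, the same idea can be extended to group by GT as well and to add the bin center. The following is only a sketch: the `toy` table and its values are made up to mimic the shape of `data1`, the 1e7 bin width is taken from the question, and the column names `ymean` and `center` follow the question's code.

```r
library(data.table)

# Toy data in the same shape as data1 (values are made up for illustration)
toy <- data.table(
  CHROM = c("chr01", "chr01", "chr01", "chr02", "chr02"),
  POS   = c(14653, 15190, 12000000, 500, 25000000),
  DIFF  = c(254, 260, 100, 10, 30),
  GT    = c("CT", "CT", "AG", "AG", "AG")
)

bin_width <- 1e7  # bin width used in the question

# Assign each position to a bin within its chromosome
toy[, POSgroup := floor(POS / bin_width)]

# Mean DIFF per chromosome, position bin and genotype
datsum <- toy[, .(ymean = mean(DIFF)), by = .(CHROM, POSgroup, GT)]

# Center of each bin, matching the x position computed in the plyr version
datsum[, center := POSgroup * bin_width + bin_width / 2]

print(datsum)
```

Because CHROM is part of `by`, bins with the same POSgroup number on different chromosomes are kept apart instead of being averaged together.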