Suppose I have the following data.table:
library(data.table)
DT <- data.table("src" = rep(1:3, each = 25), "info" = paste0("some_info_", rep(1:3, each = 25)), "analyte" = rep(letters[1:5], each = 5), "time" = rep(1:5, 5), "mass" = rnorm(75))
setkey(DT, analyte)
DT
src info analyte time mass
1: 1 some_info_1 a 1 0.155264082
2: 1 some_info_1 a 2 1.084690475
3: 1 some_info_1 a 3 0.049223594
...
73: 3 some_info_3 e 3 -1.663257082
74: 3 some_info_3 e 4 0.384154026
75: 3 some_info_3 e 5 -1.470772427
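I would like to subset this data.table and compute the ratio of two analytes, producing a data.table with the columns: src, info, time, ratio
So far, I have managed to come up with the following: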
DT2 <- merge(DT["b", ][, list(src, info, time, "b_mass" = mass)], DT["e", ][, list(src, info, time, "e_mass" = mass)], by = c("src", "info", "time"))
DT2[, ratio := b_mass / e_mass]
DT2
src info time b_mass e_mass ratio
1: 1 some_info_1 1 0.048748340 0.55771897 0.087406638
2: 1 some_info_1 2 0.580088770 0.01919104 30.227067059
3: 1 some_info_1 3 3.448766176 1.31206449 2.628503555
4: 1 some_info_1 4 0.141492439 0.18760652 0.754197859
5: 1 some_info_1 5 -1.298840808 0.28843784 -4.503018083
6: 2 some_info_2 1 -0.999177587 0.82712712 -1.208009708
7: 2 some_info_2 2 -1.371567195 -0.05886506 23.300190728
8: 2 some_info_2 3 -0.577922327 -0.38846122 1.487722077
9: 2 some_info_2 4 0.519693729 -0.54861732 -0.947279120
10: 2 some_info_2 5 0.002534584 1.07951446 0.002347892
11: 3 some_info_3 1 0.449420691 0.62283878 0.721568248
12: 3 some_info_3 2 -1.014887775 -0.08623683 11.768611151
13: 3 some_info_3 3 -0.990034334 0.20649892 -4.794380280
14: 3 some_info_3 4 0.935628916 -0.71373423 -1.310892595
15: 3 some_info_3 5 -2.146412498 -0.28746621 7.466660092
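Is there a more efficient way to compute the ratio of two analytes for each source?
Also, is there a way to use a vector of colours when plotting data through data.table? For example,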
plot(0, 0, type = "n", xlim = range(DT2$time), ylim = range(DT2$ratio), xlab = "Time", ylab = "Ratio b/e")
DT2[, lines(time, ratio, col = 1:3), by = src]
Thanks!
JMB
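Answer 0 (score: 2)
This loops over the analytes and computes every ratio, adding one column (ratio.a, ratio.b, ...) per denominator analyte: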
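for (kk in unique(DT[['analyte']])) {
  # DT is keyed by analyte, so DT[kk] selects the 15 rows of analyte kk.
  # The division recycles those 15 masses across all 75 rows, which lines up
  # only because every analyte block has the same src/time row order.
  DT[, (paste('ratio', kk, sep = '.')) := mass / DT[force(kk)][['mass']]]
}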
# create the plot
DT['b', plot(x=time,y= ratio.e,type='n')]
# add the lines (creating a colour vector `cc` first)
cc <- c('red','green','blue')
DT['b', lines(x = time,y = ratio.e, col = cc[src]), by = src]
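Note that `cc[src]` only picks the intended colour because `src` happens to be the integers 1 to 3. A minimal sketch of one way around that, assuming data.table's `.GRP` group counter is acceptable here; the `toy` table, its values and the seed are made up for illustration, and `pdf(NULL)` is only there so the sketch runs without opening a plot window:

```r
library(data.table)

set.seed(1)  # hypothetical seed, only to make the sketch reproducible
# Hypothetical stand-in for the ratio table, with a non-integer src
toy <- data.table(src   = rep(c("x", "y", "z"), each = 5),
                  time  = rep(1:5, times = 3),
                  ratio = rnorm(15))
cc <- c("red", "green", "blue")

pdf(NULL)  # off-screen graphics device, no file is written
plot(0, 0, type = "n", xlim = range(toy$time), ylim = range(toy$ratio),
     xlab = "Time", ylab = "Ratio")
# .GRP counts the groups 1, 2, 3, ... so it can index cc whatever src contains
toy[, lines(time, ratio, col = cc[.GRP]), by = src]
dev.off()
```

With this, each group gets the colour at its group position rather than at the value of `src`.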
Alternatively, using ggplot2
library(ggplot2)
ggplot(DT['b'], aes(x = time,y=ratio.e, colour = factor(src))) + geom_line()
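The loop in this answer depends on every analyte block being in the same row order. If that cannot be guaranteed, a rough sketch of an alternative is to reshape to wide format with `dcast` so that the masses are matched on src, info and time rather than by position. The seed and the names `wide` and `res` are assumptions for illustration, not part of the original question:

```r
library(data.table)

set.seed(42)  # assumed seed, only to make the sketch reproducible
# Same layout as the question's DT, with the recycling written out explicitly
DT <- data.table(src     = rep(1:3, each = 25),
                 info    = paste0("some_info_", rep(1:3, each = 25)),
                 analyte = rep(rep(letters[1:5], each = 5), times = 3),
                 time    = rep(1:5, times = 15),
                 mass    = rnorm(75))

# One row per (src, info, time), one column per analyte
wide <- dcast(DT, src + info + time ~ analyte, value.var = "mass")

# Ratio of analyte b to analyte e, aligned by the key columns
wide[, ratio := b / e]
res <- wide[, .(src, info, time, ratio)]
res
```

This gives the src, info, time, ratio layout asked for in the question, and any other pair of analytes can be divided the same way from `wide`.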