I want to use the metap package to combine several p-values into a single meta p-value.
My data frame tt has 3 p-value columns:
> dput(head(tt))
structure(list(RS = c("rs2089177", "rs4360974", "rs6502526",
"rs8069906", "rs9905280", "rs4313843"), G = c(0.9986, 0.9738,
0.9744, 0.7184, 0.7205, 0.9804), E = c(0.7153, 0.7838, 0.7839,
0.4918, 0.4861, 0.8522), B = c(0.604716, 0.430228, 0.42916, 0.521452,
0.465758, 0.474313)), class = c("data.table", "data.frame"), row.names = c(NA,
-6L), .internal.selfref = <pointer: 0x10200eee0>)
I also have a data frame df that holds the corresponding weight for each p-value in tt:
> dput(head(df))
structure(list(wg = c(40.6324993078201, 40.6324993078201, 40.6324993078201,
40.6324993078201, 40.6324993078201, 40.6324993078201), we = c(35.3977400408557,
35.3977400408557, 35.3977400408557, 35.3977400408557, 35.3977400408557,
35.3977400408557), wb = c(580.643608420863, 580.643608420863,
580.643608420863, 580.643608420863, 580.643608420863, 580.643608420863
), RS = c("rs2089177", "rs4360974", "rs6502526", "rs8069906",
"rs9905280", "rs4313843")), row.names = c(NA, 6L), class = "data.frame")
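The RS column is identical in df and tt.
How can I use the sumz() function to create a new data frame that looks like tt but has an additional column, named for example "META", holding the meta p-value computed for each row?
Here is an example of the calculation for the p-values in the first row: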
> sumz(c(0.9986,0.7153,0.604716), weights = c(40.6325,35.39774,580.6436), na.action = na.fail)
p = 0.6940048
This is the function I am referring to: https://www.rdocumentation.org/packages/metap/versions/1.1/topics/sumz
I tried merging the two data frames and applying a function to each row:
> head(q)
ID P G E wb wg we
1: rs1029830 0.0979931 0.0054060 0.39160 580.6436 40.6325 35.39774
2: rs1029832 0.1501820 0.0028140 0.39320 580.6436 40.6325 35.39774
3: rs11078374 0.1701250 0.0009805 0.49730 580.6436 40.6325 35.39774
4: rs1124961 0.1710150 0.7252000 0.05737 580.6436 40.6325 35.39774
5: rs1135237 0.1493650 0.6851000 0.06354 580.6436 40.6325 35.39774
6: rs11867934 0.0757972 0.0006140 0.00327 580.6436 40.6325 35.39774
helper <- function(x) {
  p <- sumz(x[2:4], weights = x[5:7])$p
  p
}
q$META <- apply(q, MARGIN = 1, helper)
But I get this error:
Error in sumz(x[2:4], weights = x[5:7]) :
Must have at least two valid p values
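Answer 0 (score: 0)
First, since you say RS is the same in both, that raises a flag for me: "how do we know for certain that the rows are always aligned correctly?" To be defensive, I would answer "not 100%", and join/merge the two tables so that the rows are guaranteed to line up.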
quux <- tt[df, on="RS"]
quux
# RS G E B wg we wb
# 1: rs2089177 0.9986 0.7153 0.604716 40.6325 35.39774 580.6436
# 2: rs4360974 0.9738 0.7838 0.430228 40.6325 35.39774 580.6436
# 3: rs6502526 0.9744 0.7839 0.429160 40.6325 35.39774 580.6436
# 4: rs8069906 0.7184 0.4918 0.521452 40.6325 35.39774 580.6436
# 5: rs9905280 0.7205 0.4861 0.465758 40.6325 35.39774 580.6436
# 6: rs4313843 0.9804 0.8522 0.474313 40.6325 35.39774 580.6436
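To illustrate why the join is safer than relying on row position, here is a small sketch with made-up data (tt_demo, df_demo and their values are hypothetical, not taken from the question). The rows of df_demo are deliberately shuffled; after joining on RS each weight still lands next to its own p-value. Note that the result follows the row order of the table inside the brackets.

```r
library(data.table)

# Hypothetical toy tables: same RS keys, different row order.
tt_demo <- data.table(RS = c("rs1", "rs2", "rs3"), G = c(0.1, 0.2, 0.3))
df_demo <- data.frame(wg = c(30, 10, 20), RS = c("rs3", "rs1", "rs2"))

# Join on the key; rows come back in the order of df_demo.
joined <- tt_demo[df_demo, on = "RS"]
joined
```

A plain cbind(tt_demo, df_demo) would instead have paired rs1 with the weight meant for rs3.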
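From here, for each row, it is just a matter of applying one part of the row (the p-values) together with the other part of the same row (the weights):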
quux$META <- sapply(seq_len(nrow(quux)), function(rn) {
unlist(sumz(as.matrix(quux[,.(G,E,B)])[rn,], weights = as.vector(quux[,.(wg,we,wb)])[rn,],
na.action=na.fail)["p"])
})
quux
# RS G E B wg we wb META
# 1: rs2089177 0.9986 0.7153 0.604716 40.6325 35.39774 580.6436 0.9863582
# 2: rs4360974 0.9738 0.7838 0.430228 40.6325 35.39774 580.6436 0.9294546
# 3: rs6502526 0.9744 0.7839 0.429160 40.6325 35.39774 580.6436 0.9300445
# 4: rs8069906 0.7184 0.4918 0.521452 40.6325 35.39774 580.6436 0.6379392
# 5: rs9905280 0.7205 0.4861 0.465758 40.6325 35.39774 580.6436 0.6055061
# 6: rs4313843 0.9804 0.8522 0.474313 40.6325 35.39774 580.6436 0.9605584
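As a cross-check on the META column above, here is a minimal base-R sketch of the weighted Stouffer (sum of z) combination, which is the method sumz() is documented to implement. stouffer_p is a hypothetical helper written for this illustration and is not part of metap. For the first row it gives the 0.6940048 from the question when the weights are used, and the 0.9863582 shown in the table above when all weights are equal. That suggests the weights did not take effect in the sapply() version, possibly because as.vector() on a data.table still yields a table-like object rather than a plain numeric vector; that explanation is an assumption worth verifying on your own setup.

```r
# Hypothetical helper: weighted Stouffer combination of p-values.
# With no weights supplied, every p-value gets weight 1.
stouffer_p <- function(p, w = rep(1, length(p))) {
  z <- sum(w * qnorm(p, lower.tail = FALSE)) / sqrt(sum(w^2))
  pnorm(z, lower.tail = FALSE)
}

# First row of the data from the question.
p_row1 <- c(0.9986, 0.7153, 0.604716)
w_row1 <- c(40.6325, 35.39774, 580.6436)

p_weighted   <- stouffer_p(p_row1, w_row1)  # about 0.694
p_unweighted <- stouffer_p(p_row1)          # about 0.986
```

Passing the weights as a plain numeric vector, for example via unlist() or as.matrix(), avoids the ambiguity.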
Or, in a more data.table-centric way:
mysumz <- function(x, w) sumz(unlist(x), weights = unlist(w), na.action = na.fail)[["p"]]
quux[, META := mysumz(.(G,E,B), .(wg,we,wb)), by = seq_len(nrow(quux))]
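(Borrowed from https://stackoverflow.com/a/36802640.) The helper function is required because each call to mysumz receives a list for each of x and w, but sumz requires vectors. If you want to verify this, call debugonce(mysumz) first, then run quux[, META := ...] and inspect x and w to see how it works.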
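As for the error in the question's apply() attempt, one thing worth checking is type coercion. apply() first converts the table to a matrix, and because the ID column is character, every value in the row (including the p-values and weights) reaches the helper as a string. The sketch below uses the first two rows of q from the question; q_demo is a hypothetical name, and whether this is the whole story for your metap version is an assumption.

```r
# First two rows of q from the question (p-value columns only, plus ID).
q_demo <- data.frame(
  ID = c("rs1029830", "rs1029832"),
  P  = c(0.0979931, 0.1501820),
  G  = c(0.0054060, 0.0028140),
  E  = c(0.39160, 0.39320),
  stringsAsFactors = FALSE
)

# With the character ID column included, each row arrives as character.
cls_all <- apply(q_demo, 1, class)

# Restricting to the numeric columns keeps each row numeric.
cls_num <- apply(q_demo[, c("P", "G", "E")], 1, class)
```

Both approaches in this answer sidestep the issue because they select only the numeric columns before calling sumz.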