I have a data frame like the example below (a small extract of my actual data frame).
frequencies <- data.frame(sex=c("female", "female", "male", "male", "female", "female", "male", "male", "female", "female", "male", "male", "female", "female", "male", "male"),
  ecotype=c("Crab", "Wave", "Crab", "Wave", "Crab", "Wave", "Crab", "Wave", "Crab", "Wave", "Crab", "Wave", "Crab", "Wave", "Crab", "Wave"),
  contig_ID=c("Contig100169_2367", "Contig100169_2367", "Contig100169_2367", "Contig100169_2367", "Contig100169_2367", "Contig100169_2367", "Contig100169_2367", "Contig100169_2367",
    "Contig100169_2481", "Contig100169_2481", "Contig100169_2481", "Contig100169_2481", "Contig100169_2481", "Contig100169_2481", "Contig100169_2481", "Contig100169_2481"),
  allele=c("p", "p", "p", "p", "q", "q", "q", "q", "p", "p", "p", "p", "q", "q", "q", "q"),
  frequency=c(157, 98, 140, 65, 29, 8, 26, 9, 182, 108, 147, 80, 46, 4, 49, 4))
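I want to run a separate chi-squared test for each combination of 'contig_ID' and 'ecotype', testing for an association between 'sex' and 'allele'. I then want to summarise the results in a table that gives the p-value for each combination of 'contig_ID' and 'ecotype'. For the example table above, I would expect a results table with four p-values, like the one below.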
results <- data.frame(ecotype=c("Crab", "Wave", "Crab", "Wave"),
  contig_ID=c("Contig100169_2367", "Contig100169_2367", "Contig100169_2481", "Contig100169_2481"),
  pvalue=c("pval", "pval", "pval", "pval"))
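Alternatively, simply adding a p-value column to the original table would also work, with the p-value for each combination repeated across all of its rows.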
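I have been trying to combine functions such as lapply() and summarise() with chisq.test() to achieve this, but have had no luck so far. I also tried an approach similar to this one: R chi squared test (3x2 contingency table) for each row in a table, but could not get that to work either.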
Answer 0 (score: 1)
We can group by the contig_ID and ecotype columns, create a nested data frame, and convert the data in each group to a matrix, as follows.
library(tidyverse)
frequencies2 <- frequencies %>%
  group_by(contig_ID, ecotype) %>%
  nest() %>%
  mutate(M = map(data, function(dat){
    # Reshape to one row per allele, one column per sex
    dat2 <- dat %>% spread(sex, frequency)
    # Drop the allele column and keep the counts as a matrix
    M <- as.matrix(dat2[, -1])
    row.names(M) <- dat2$allele
    return(M)
  }))
If we look at the first element of the M column, we see that the data for each group have been converted to a matrix.
frequencies2$M[[1]]
#   female male
# p    157  140
# q     29   26
From here, we can apply chisq.test to each matrix and pull out the p-value. frequencies3 is the final output.
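The code that produces frequencies3 does not appear in the answer as preserved here. The following is a minimal sketch of what that last step might look like, not necessarily the answerer's exact code. It assumes map_dbl() from purrr is used to extract the p-value from each matrix. The pvalue column name is taken from the results table in the question. frequencies4 is a hypothetical name for the alternative output the question mentions, with the p-value repeated on every original row. The example data are rebuilt with rep() only so that the block runs on its own.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same example data as in the question, built compactly with rep().
frequencies <- data.frame(
  sex       = rep(rep(c("female", "male"), each = 2), times = 4),
  ecotype   = rep(c("Crab", "Wave"), times = 8),
  contig_ID = rep(c("Contig100169_2367", "Contig100169_2481"), each = 8),
  allele    = rep(rep(c("p", "q"), each = 4), times = 2),
  frequency = c(157, 98, 140, 65, 29, 8, 26, 9,
                182, 108, 147, 80, 46, 4, 49, 4)
)

# Nest by group and build one allele-by-sex count matrix per group
# (same approach as the answer above).
frequencies2 <- frequencies %>%
  group_by(contig_ID, ecotype) %>%
  nest() %>%
  mutate(M = map(data, function(dat) {
    dat2 <- dat %>% spread(sex, frequency)
    M <- as.matrix(dat2[, -1])
    row.names(M) <- dat2$allele
    M
  }))

# Assumed final step: run chisq.test() on each matrix and keep only the p-value.
frequencies3 <- frequencies2 %>%
  mutate(pvalue = map_dbl(M, ~ chisq.test(.x)$p.value)) %>%
  ungroup() %>%
  select(ecotype, contig_ID, pvalue)

# Alternative output (hypothetical name): p-value repeated on each original row.
frequencies4 <- left_join(frequencies, frequencies3,
                          by = c("contig_ID", "ecotype"))
```

Two points are worth keeping in mind when reading the p-values. For 2x2 tables, chisq.test() applies Yates' continuity correction by default (correct = TRUE). It also warns that the approximation may be incorrect when expected counts are small, which can happen for groups with few "q" alleles; fisher.test() could be substituted in the same map_dbl() call if an exact test is preferred.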