I am trying to apply a function to every pairwise combination of the elements of this vector:
> c('EAS_MAF', 'AMR_MAF', 'AFR_MAF', 'EUR_MAF', 'SAS_MAF')
[1] "EAS_MAF" "AMR_MAF" "AFR_MAF" "EUR_MAF" "SAS_MAF"
To arrange the values into every combination of 2, I am using the combn function:
> list <- combn(c('EAS_MAF', 'AMR_MAF', 'AFR_MAF', 'EUR_MAF', 'SAS_MAF'),2)
> list
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "EAS_MAF" "EAS_MAF" "EAS_MAF" "EAS_MAF" "AMR_MAF" "AMR_MAF" "AMR_MAF" "AFR_MAF" "AFR_MAF" "EUR_MAF"
[2,] "AMR_MAF" "AFR_MAF" "EUR_MAF" "SAS_MAF" "AFR_MAF" "EUR_MAF" "SAS_MAF" "EUR_MAF" "SAS_MAF" "SAS_MAF"
The function itself counts the rows that satisfy specific conditions and returns a list:
sharedCalc.func <- function(pop1, pop2, table = variantTable){
  S.count = sum(table[pop1]>0 & table[pop2]>0 &
                table['consequence'] == 'synonymous SNV')
  NS.count = sum(table[pop1]>0 & table[pop2]>0 &
                 table['consequence'] != 'synonymous SNV')
  counts <- list("NS" = NS.count, "S" = S.count, "NS/S" = NS.count/S.count)
  return(counts)
}
Here is example output from this function:
> sharedCalc.func('EAS_MAF', 'AMR_MAF')
$NS
[1] 59325
$S
[1] 43434
$`NS/S`
[1] 1.365865
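To run this function over my combinations, I thought the apply function would be the best fit. However, it returns a non-conformable arrays error: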
> apply(list, 2, sharedCalc.func)
Error in FUN(newX[, i], ...) : binary operation on non-conformable arrays
I also tried the outer function and got the same error:
> outer(list[1,], list[2,], sharedCalc.func)
Error in FUN(X, Y, ...) : binary operation on non-conformable arrays
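I don't know why I am getting this error. Could it be because the function returns a list? I tried using lapply to return a list, but that did not work either. Here is my input data: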
> dput(head(variantTable))
structure(list(CHROM = c("1", "1", "1", "1", "1", "1"), POS = c(69224L,
69428L, 69486L, 69487L, 69496L, 69521L), ID = c("rs568964432",
"rs140739101", "rs548369610", "rs568226429", "rs150690004", "rs553724620"
), REF = c("A", "T", "C", "G", "G", "T"), ALT = c("T", "G", "T",
"A", "A", "A"), AF = c(0.000399361, 0.0189696, 0.000199681, 0.000399361,
0.000998403, 0.000399361), AC = c(2L, 95L, 1L, 2L, 5L, 2L), AN = c(5008L,
5008L, 5008L, 5008L, 5008L, 5008L), consequence = c("nonsynonymous SNV",
"nonsynonymous SNV", "synonymous SNV", "nonsynonymous SNV", "nonsynonymous SNV",
"nonsynonymous SNV"), gene = c("OR4F5", "OR4F5", "OR4F5", "OR4F5",
"OR4F5", "OR4F5"), refGene_id = c("NM_001005484", "NM_001005484",
"NM_001005484", "NM_001005484", "NM_001005484", "NM_001005484"
), AA_change = c("('D', 'V')", "('F', 'C')", "('N', 'N')", "('A', 'T')",
"('G', 'S')", "('I', 'N')"), X0.fold_count = c(572L, 572L, 572L,
572L, 572L, 572L), X4.fold_count = c(141L, 141L, 141L, 141L,
141L, 141L), EAS_MAF = c(0, 0.003, 0.001, 0, 0, 0), AMR_MAF = c(0.0029,
0.036, 0, 0, 0.0014, 0.0029), AFR_MAF = c(0, 0.0015, 0, 0.0015,
0.003, 0), EUR_MAF = c(0, 0.0497, 0, 0, 0, 0), SAS_MAF = c(0,
0.0153, 0, 0, 0, 0), nonAFR_N = c(309227L, 1128036L, 262551L,
0L, 309227L, 309227L), nonAFR_weighted = c(0.0029, 0.0261704282487438,
0.001, 0, 0.0014, 0.0029)), .Names = c("CHROM", "POS", "ID",
"REF", "ALT", "AF", "AC", "AN", "consequence", "gene", "refGene_id",
"AA_change", "X0.fold_count", "X4.fold_count", "EAS_MAF", "AMR_MAF",
"AFR_MAF", "EUR_MAF", "SAS_MAF", "nonAFR_N", "nonAFR_weighted"
), row.names = c(NA, 6L), class = "data.frame")
Answer 0 (score: 2)
Try the following:
l <- combn(c('EAS_MAF', 'AMR_MAF', 'AFR_MAF', 'EUR_MAF', 'SAS_MAF'),2)
l
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "EAS_MAF" "EAS_MAF" "EAS_MAF" "EAS_MAF" "AMR_MAF" "AMR_MAF"
[2,] "AMR_MAF" "AFR_MAF" "EUR_MAF" "SAS_MAF" "AFR_MAF" "EUR_MAF"
[,7] [,8] [,9] [,10]
[1,] "AMR_MAF" "AFR_MAF" "AFR_MAF" "EUR_MAF"
[2,] "SAS_MAF" "EUR_MAF" "SAS_MAF" "SAS_MAF"
mapply(sharedCalc.func, l[1,], l[2,])
EAS_MAF EAS_MAF EAS_MAF EAS_MAF AMR_MAF AMR_MAF AMR_MAF AFR_MAF
NS 1 1 1 1 2 1 1 1
S 0 0 0 0 0 0 0 0
NS/S Inf Inf Inf Inf Inf Inf Inf Inf
AFR_MAF EUR_MAF
NS 1 1
S 0 0
NS/S Inf Inf
mapply is the multivariate version of sapply; use it when you want to iterate over several vectors or lists in parallel.
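To make the difference from apply concrete, here is a minimal sketch on made-up data. The data frame toy and the helper shared_count are hypothetical stand-ins, not the asker's variantTable or sharedCalc.func. In the question, apply(list, 2, sharedCalc.func) most likely fails because apply hands each column over as a single length-2 character vector, so both names land in pop1 and pop2 is never supplied. mapply instead pairs the two rows element by element; wrapping the function so it unpacks the vector makes apply work as well.

```r
# Toy data standing in for variantTable (values invented for illustration)
toy <- data.frame(
  A = c(0.1, 0.1, 0.2, 0),
  B = c(0.3, 0.2, 0,   0),
  C = c(0,   0,   0.4, 0.5)
)

# Hypothetical helper: number of rows where both columns are positive
shared_count <- function(pop1, pop2, table = toy) {
  sum(table[[pop1]] > 0 & table[[pop2]] > 0)
}

pairs <- combn(c("A", "B", "C"), 2)

# mapply walks both rows in parallel: (A, B), (A, C), (B, C)
via_mapply <- mapply(shared_count, pairs[1, ], pairs[2, ])

# apply passes each column as ONE length-2 vector, so unpack it explicitly
via_apply <- apply(pairs, 2, function(p) shared_count(p[1], p[2]))
```

Both calls give the same counts; the mapply result is additionally named after the first argument, which is why only one population name shows up in the column headers above.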
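As a side note: overwriting built-in R functionality with your own objects is almost always a bad idea. Naming an object list is therefore a bad idea, which is why I renamed it to l in the code above.
To keep both population names in the column names, you can do the following: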
out <- mapply(sharedCalc.func, l[1,], l[2,])
setNames(data.frame(out), mapply(paste, l[1,], l[2,], sep="-"))
EAS_MAF-AMR_MAF EAS_MAF-AFR_MAF EAS_MAF-EUR_MAF EAS_MAF-SAS_MAF
NS 1 1 1 1
S 0 0 0 0
NS/S Inf Inf Inf Inf
AMR_MAF-AFR_MAF AMR_MAF-EUR_MAF AMR_MAF-SAS_MAF AFR_MAF-EUR_MAF
NS 2 1 1 1
S 0 0 0 0
NS/S Inf Inf Inf Inf
AFR_MAF-SAS_MAF EUR_MAF-SAS_MAF
NS 1 1
S 0 0
NS/S Inf Inf
Answer 1 (score: 2)
You are trying to get R to use column 1 as the inputs, then move on to column 2, and so on.
inputs <- combn(c('EAS_MAF', 'AMR_MAF', 'AFR_MAF', 'EUR_MAF', 'SAS_MAF'),2)
output <- Map(sharedCalc.func, inputs[1, ], inputs[2, ])
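Map takes the first value of inputs[1, ] and the first value of inputs[2, ] as the arguments for the first call to sharedCalc.func and stores the result in a list. It then moves on to the second values, and so on, until all values have been used. So output is now a list containing 10 named sub-lists.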
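If a table is easier to work with than a list of sub-lists, the Map result can be relabelled and flattened. The following is only a sketch on made-up data: toy and both_stats are hypothetical stand-ins for variantTable and sharedCalc.func, and the "A-B" naming scheme is an assumption.

```r
# Toy data standing in for variantTable (values invented for illustration)
toy <- data.frame(
  A = c(0.1, 0.1, 0.2, 0),
  B = c(0.3, 0.2, 0,   0),
  C = c(0,   0,   0.4, 0.5)
)

# Hypothetical helper returning a list, like the function in the question
both_stats <- function(pop1, pop2, table = toy) {
  both <- table[[pop1]] > 0 & table[[pop2]] > 0
  list(n = sum(both), frac = mean(both))
}

inputs <- combn(c("A", "B", "C"), 2)
output <- Map(both_stats, inputs[1, ], inputs[2, ])

# Name each sub-list after BOTH populations instead of only the first one
names(output) <- paste(inputs[1, ], inputs[2, ], sep = "-")

# Flatten each sub-list to a named vector and stack them: one row per pair
result <- do.call(rbind, lapply(output, unlist))
```

Individual results can then be looked up by pair, for example output[["A-C"]]$n or result["A-C", "n"].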
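Note: something seems to be off with your function, because it does not produce what it should. Calling sharedCalc.func("EAS_MAF", "AMR_MAF") on the posted sample data, which is the first element of output, gives: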
output[[1]]
# $NS
# [1] 1
# $S
# [1] 0
# $`NS/S`
# [1] Inf