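I have a list `aa` whose entries each hold the name of an element of another list, `bb`, together with a second value (called `cm`). The elements of `bb` are character vectors. I have a loop that iterates over `bb` and, for every element that matches the strings I specify, adds it as a new row to a data frame. I also need to add the corresponding `cm` value to that row.
Example: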
library("tidyverse")
aa <- list(c(123, 1), c(234, 1), c(345, 2), c(456, 3))
bb <- list("123" = c("a", "b", "c"), "234" = c("b", "c", "d"), "345" = c("c", "d", "e"), "456" = c("f", "g", "h"))
cc <- c("a", "b", "c")
tbl <- NULL
for (a in aa) {
  for (b in bb) {
    if (any(cc %in% b)) {
      tb <- tibble(cm = a[2], n1 = b[1], n2 = b[2], n3 = b[3])
      tbl <- bind_rows(tbl, tb)
    }
  }
}
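This iterates over every element of `bb` for every entry of `aa`, so each matching element of `bb` is paired with every `cm` value, which is not what I want. My output should look like this: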
output <- tibble(cm = c(1, 1, 2), n1 = c("a", "b", "c"),
n2 = c("b", "c", "d"), n3 = c("c", "d", "e"))
> output
# A tibble: 3 x 4
     cm n1    n2    n3
  <dbl> <chr> <chr> <chr>
1     1 a     b     c
2     1 b     c     d
3     2 c     d     e
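I thought something like the following might work; at the very least I could loop over `tbl` afterwards and use `nm` to replace it with the appropriate `cm` value: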
tbl <- NULL
for (a in aa) {
  for (b in bb) {
    if (any(cc %in% b)) {
      tb <- tibble(nm = names(bb)[b], n1 = b[1], n2 = b[2], n3 = b[3])
      tbl <- bind_rows(tbl, tb)
    }
  }
}
I really don't understand why this doesn't work: `names(bb)[1]` returns `123`, so I assumed `names(bb)[b]` would work inside the loop.
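A small sketch of what is probably happening with `names(bb)[b]`: inside `for (b in bb)`, `b` is the element's value (a character vector such as `c("a", "b", "c")`), not its position, so it cannot act as an index the way `1` does. The shortened `bb` and the variable names `lookup_by_value` and `seen` are made up for this illustration.

```r
bb <- list("123" = c("a", "b", "c"), "234" = c("b", "c", "d"))

# Inside `for (b in bb)`, b is the element's value, not its position or name.
b <- bb[[1]]

# names(bb) is itself an unnamed character vector, so indexing it with the
# strings "a", "b", "c" matches nothing and gives NA for each of them.
lookup_by_value <- names(bb)[b]

# Looping over the names keeps both the name and the value available.
seen <- character(0)
for (nm in names(bb)) {
  b <- bb[[nm]]
  seen <- c(seen, paste(nm, paste(b, collapse = " ")))
}
print(lookup_by_value)
print(seen)
```

Iterating with `for (nm in names(bb))` or `for (i in seq_along(bb))` is one common way to keep the name at hand while still reaching the value through `bb[[nm]]` or `bb[[i]]`.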
Answer 0 (score: 0)
If you are happy with a base R solution that has no explicit loops, would this work?
# generate data
aa <- list(c(123, 1), c(234, 1), c(345, 2), c(456, 3))
# cm is stored as an extra element of bb
bb <- list("123" = c("a", "b", "c"), "234" = c("b", "c", "d"),
           "345" = c("c", "d", "e"), "456" = c("f", "g", "h"),
           cm = c(1, 1, 2))
cc <- c("a", "b", "c")
tbl <- data.frame(
  bb[["cm"]],
  # apply to each element of aa
  do.call(rbind, lapply(aa, function(x, y, c) { # function takes 3 args
    # names of bb that also appear in this element of aa
    names_y <- as.character(intersect(x, names(y)))
    # turn subset of bb into data.frame
    out <- as.data.frame(do.call(rbind, y[names_y]))
    # keep only the rows in which any element is %in% cc
    # (note the comma: without it, `[` would select columns, not rows)
    out <- out[apply(out, 1, function(x, c) any(x %in% c), c), , drop = FALSE]
    return(out)
  }, bb, cc))) # pass bb and cc as args to the function in lapply()
names(tbl) <- c("cm", paste0("n", 1:(ncol(tbl) - 1)))
which gives
> tbl
    cm n1 n2 n3
123  1  a  b  c
234  1  b  c  d
345  2  c  d  e
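The code above relies on a `cm` vector that was typed into `bb` by hand. As a variation, here is a minimal sketch that reads `cm` from `aa` instead. It assumes that the first value of each `aa` entry is the name of an element of `bb` and the second is its `cm`, as in the question's example; the names `rows`, `key`, and `tbl2` are made up for this illustration.

```r
aa <- list(c(123, 1), c(234, 1), c(345, 2), c(456, 3))
bb <- list("123" = c("a", "b", "c"), "234" = c("b", "c", "d"),
           "345" = c("c", "d", "e"), "456" = c("f", "g", "h"))
cc <- c("a", "b", "c")

rows <- lapply(aa, function(a) {
  key <- as.character(a[1])  # assumed: a[1] names an element of bb
  b <- bb[[key]]             # NULL if bb has no element with that name
  # skip entries with no counterpart in bb or no overlap with cc
  if (is.null(b) || !any(cc %in% b)) return(NULL)
  data.frame(cm = a[2], n1 = b[1], n2 = b[2], n3 = b[3],
             row.names = key, stringsAsFactors = FALSE)
})

# drop the skipped entries, then stack the one-row data frames
tbl2 <- do.call(rbind, Filter(Negate(is.null), rows))
print(tbl2)
```

For the example data this should give the same three rows as above, with `cm` taken from the matching entry of `aa`, so nothing has to be kept in sync by hand if `aa` or `cc` changes.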