Question

This may be bloody obvious, but I'm new to R. I have two data frames that I want to merge:

longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
txt <- data.frame(longtext)
queries <- c("burp", "blah")
query <- data.frame(queries)

I searched for the strings from query within the longer text strings in txt. The matches are saved in a list like this:

matches <-list(c(1,3), c(2))

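The index of each element in matches, e.g. [[1]], refers to a row in query. The content of that element, e.g. (1, 3) for the first one, gives the rows of txt where the search hit: rows 1 and 3. So I want to merge the two data frames using the indices and contents of matches to get: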

queries; longtext
"burp"; "bla bla burp bla blub"
"burp"; "blablaz burp"
"blah"; "blah bladd"

But... my loop over the indices and contents doesn't work. Is there a simpler way with apply()? There will be a lot of data...

matches_long <- data.frame()  
for (i in 1:length(matches)) {
  for (l in 1:length(matches[[i]])) {
    matches_long[[l]] <- data.frame(query[[i]], txt[[matches[[i]][l]]])}}

Answer 1

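It seems to me you can repeat the rows of query according to the number of hits in each element of matches, and then just assign the matched values: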

res <- query[rep(seq_along(matches), sapply(matches, length)),, drop = FALSE] 
res["longtext"] <- txt$longtext[unlist(matches)]
res
#     queries              longtext
# 1      burp bla bla burp bla blub
# 1.1    burp          blablaz burp
# 2      blah            blah bladd

In R v3.2+, you can replace sapply(matches, length) with lengths(matches).
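As a minimal sketch of the same idea with lengths(), the result can also be built in one step instead of subsetting query and then adding a column. The data is copied from the question; stringsAsFactors = FALSE is my addition so the columns stay character on older R versions.

```r
# Data from the question
txt <- data.frame(
  longtext = c("bla bla burp bla blub", "blah bladd", "blablaz burp"),
  stringsAsFactors = FALSE
)
query <- data.frame(queries = c("burp", "blah"), stringsAsFactors = FALSE)
matches <- list(c(1, 3), c(2))

# Repeat each query once per hit, then pick the matching rows of txt.
# lengths(matches) is c(2, 1); unlist(matches) is c(1, 3, 2).
res <- data.frame(
  queries  = rep(query$queries, lengths(matches)),
  longtext = txt$longtext[unlist(matches)],
  stringsAsFactors = FALSE
)
res
```

This gives three rows, two for "burp" and one for "blah", with plain row names 1 to 3. A query whose element in matches is empty gets a repeat count of 0 and simply drops out of the result.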

Answer 2

@David Arenburg's answer is better, but since I was about to paste this in anyway:

names(matches) <- queries
stack(lapply(matches, function(x){longtext[x]}))
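Note that stack() names its columns values and ind, and ind is a factor. A small sketch of getting from there to the layout asked for in the question, assuming longtext and queries are the plain character vectors defined at the top of the question (not the data frame columns):

```r
# Vectors and match list from the question
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
queries <- c("burp", "blah")
matches <- list(c(1, 3), c(2))

# Name each list element after its query, then stack the matched texts
names(matches) <- queries
out <- stack(lapply(matches, function(x) longtext[x]))

# Reorder and rename the "ind"/"values" columns to match the question
out <- setNames(out[c("ind", "values")], c("queries", "longtext"))

# Convert to character so both columns are plain strings
out$queries <- as.character(out$queries)
out$longtext <- as.character(out$longtext)
out
```

This relies on the queries being unique, since they are used as list names.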
