I have three text documents stored as a list of lists named "dlist":
dlist <- structure(list(name = c("a", "b", "c"), text = list(c("the", "quick", "brown"), c("fox", "jumps", "over", "the"), c("lazy", "dog"))), .Names = c("name", "text"))
In my head, I find it helpful to picture the list like this:
name text
1 a c("the", "quick", "brown")
2 b c("fox", "jumps", "over", "the")
3 c c("lazy", "dog")
How can I reshape it into the following? The idea is to plot the result, so something that can be melted for ggplot2 would be good.
name text
1 a the
2 a quick
3 a brown
4 b fox
5 b jumps
6 b over
7 b the
8 c lazy
9 c dog
There is one row per word, giving both the word and its parent document.
Here is what I have tried:
> expand.grid(dlist)
name text
1 a the, quick, brown
2 b the, quick, brown
3 c the, quick, brown
4 a fox, jumps, over, the
5 b fox, jumps, over, the
6 c fox, jumps, over, the
7 a lazy, dog
8 b lazy, dog
9 c lazy, dog
> sapply(seq(1,3), function(x) (expand.grid(dlist$name[[x]], dlist$text[[x]])))
[,1] [,2] [,3]
Var1 factor,3 factor,4 factor,2
Var2 factor,3 factor,4 factor,2
> unlist(dlist)
name1 name2 name3 text1 text2 text3 text4
"a" "b" "c" "the" "quick" "brown" "fox"
text5 text6 text7 text8 text9
"jumps" "over" "the" "lazy" "dog"
> sapply(seq(1,3), function(x) (cbind(dlist$name[[x]], dlist$text[[x]])))
[[1]]
[,1] [,2]
[1,] "a" "the"
[2,] "a" "quick"
[3,] "a" "brown"
[[2]]
[,1] [,2]
[1,] "b" "fox"
[2,] "b" "jumps"
[3,] "b" "over"
[4,] "b" "the"
[[3]]
[,1] [,2]
[1,] "c" "lazy"
[2,] "c" "dog"
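To be fair, I am confused by the various apply and plyr functions and don't really know where to start. I have never seen a result like the one from the first "sapply" attempt above (the matrix of "factor,3" entries), and I don't understand it.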
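Answer 0 (score: 11)
If you convert dlist to a named list (a more appropriate structure, I think), you can use stack() to get the two-column data.frame you want.
(The rev() and setNames() calls in the second line are just one of many ways to adjust the column order and names to match the desired output shown in the question.)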
x <- setNames(dlist$text, dlist$name)
setNames(rev(stack(x)), c("name", "text"))
# name text
# 1 a the
# 2 a quick
# 3 a brown
# 4 b fox
# 5 b jumps
# 6 b over
# 7 b the
# 8 c lazy
# 9 c dog
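To see why the rev() and setNames() calls are needed, it may help to look at the intermediate result: stack() returns its columns as values and ind, in that order. The following is only a sketch that reuses the data from the question; the names stacked and out are illustrative and not part of the answer above.

```r
# Data from the question
dlist <- list(name = c("a", "b", "c"),
              text = list(c("the", "quick", "brown"),
                          c("fox", "jumps", "over", "the"),
                          c("lazy", "dog")))

# Named list: one character vector per document
x <- setNames(dlist$text, dlist$name)

# stack() puts the words in 'values' and the list names in 'ind'
stacked <- stack(x)
names(stacked)

# Reverse the column order, then rename to match the question
out <- setNames(rev(stacked), c("name", "text"))

# Quick check before plotting: number of words per document
table(out$name)
```

The resulting out is already in long format, with one row per word, so it should be usable as the data argument of ggplot() without a further melt step.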
Answer 1 (score: 1)
Another solution, perhaps more general:
do.call(rbind, do.call(mapply, c(dlist, FUN = data.frame, SIMPLIFY = FALSE)))
# name text
# a.1 a the
# a.2 a quick
# a.3 a brown
# b.1 b fox
# b.2 b jumps
# b.3 b over
# b.4 b the
# c.1 c lazy
# c.2 c dog
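The one-liner above can be read as two steps: mapply() builds one small data frame per document, and rbind() stacks them. Below is a sketch of the same idea written out with Map(), which is mapply() with SIMPLIFY = FALSE. The names pieces and out are illustrative, and resetting the row names is an optional extra that the one-liner does not do.

```r
# Data from the question
dlist <- list(name = c("a", "b", "c"),
              text = list(c("the", "quick", "brown"),
                          c("fox", "jumps", "over", "the"),
                          c("lazy", "dog")))

# Step 1: one data frame per document; the single name is recycled
# across all of that document's words
pieces <- Map(function(name, text) data.frame(name = name, text = text),
              dlist$name, dlist$text)

# Step 2: stack the per-document data frames into one
out <- do.call(rbind, pieces)

# Optional: replace the a.1, a.2, ... row names with 1..9
rownames(out) <- NULL
```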
Answer 2 (score: 0)
Josh's answer is much sweeter, but I thought I'd throw my hat in the ring.
dlist <- structure(list(name = c("a", "b", "c"),
text = list(c("the", "quick", "brown"),
c("fox", "jumps", "over", "the"), c("lazy", "dog"))),
.Names = c("name", "text"))
lens <- sapply(unlist(dlist[-1], recursive = FALSE), length)
data.frame(name = rep(dlist[[1]], lens), text = unlist(dlist[-1]), row.names = NULL)
## name text
## 1 a the
## 2 a quick
## 3 a brown
## 4 b fox
## 5 b jumps
## 6 b over
## 7 b the
## 8 c lazy
## 9 c dog
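That said, a list of lists is an awkward way to store this. A list of vectors (in particular a named list of vectors) would be easier to work with.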
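As a rough illustration of that last point, here is a sketch of how the same data behaves once it is stored as a named list of vectors. The names x, words_b, n_words and out are illustrative; the data is copied from the question.

```r
# Same documents, stored as a named list of character vectors
x <- list(a = c("the", "quick", "brown"),
          b = c("fox", "jumps", "over", "the"),
          c = c("lazy", "dog"))

# Words of a single document, looked up by name
words_b <- x[["b"]]

# Number of words per document (lengths() needs R >= 3.2.0)
n_words <- lengths(x)

# Long format: repeat each document name once per word
out <- data.frame(name = rep(names(x), n_words),
                  text = unlist(x, use.names = FALSE))
```

With this structure, the document names and their words travel together, so there is no need to index into the list with dlist[[1]] and dlist[-1].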