Question

I have three text documents stored as a list of lists called "dlist":

dlist <- structure(list(name = c("a", "b", "c"), text = list(c("the", "quick", "brown"), c("fox", "jumps", "over", "the"), c("lazy", "dog"))), .Names = c("name", "text"))

In my head, I find it helpful to picture this list like so:

   name  text
1  a     c("the", "quick", "brown")
2  b     c("fox", "jumps", "over", "the")
3  c     c("lazy", "dog")

How can I manipulate it into the following? The idea is to plot it, so something that can be melted for ggplot2 would be good.

  name  text
1    a   the
2    a quick
3    a brown
4    b   fox
5    b jumps
6    b  over
7    b   the
8    c  lazy
9    c   dog

There is one row per word, giving both the word and its parent document.

I have tried:

> expand.grid(dlist)
  name                  text
1    a     the, quick, brown
2    b     the, quick, brown
3    c     the, quick, brown
4    a fox, jumps, over, the
5    b fox, jumps, over, the
6    c fox, jumps, over, the
7    a             lazy, dog
8    b             lazy, dog
9    c             lazy, dog

> sapply(seq(1,3), function(x) (expand.grid(dlist$name[[x]], dlist$text[[x]])))
     [,1]     [,2]     [,3]    
Var1 factor,3 factor,4 factor,2
Var2 factor,3 factor,4 factor,2

> unlist(dlist)
  name1   name2   name3   text1   text2   text3   text4 
    "a"     "b"     "c"   "the" "quick" "brown"   "fox" 
  text5   text6   text7   text8   text9 
"jumps"  "over"   "the"  "lazy"   "dog"

> sapply(seq(1,3), function(x) (cbind(dlist$name[[x]], dlist$text[[x]])))
[[1]]
     [,1] [,2]   
[1,] "a"  "the"  
[2,] "a"  "quick"
[3,] "a"  "brown"

[[2]]
     [,1] [,2]   
[1,] "b"  "fox"  
[2,] "b"  "jumps"
[3,] "b"  "over" 
[4,] "b"  "the"  

[[3]]
     [,1] [,2]  
[1,] "c"  "lazy"
[2,] "c"  "dog"

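Suffice it to say, I'm bamboozled by the various apply and plyr functions and don't really know where to begin. I have never seen a result like the one from the first "sapply" attempt above, and I don't understand it.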

Answer 1

If you convert dlist to a named list (a more suitable structure, in my opinion), you can use stack() to get the two-column data.frame you want.

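(The rev() and setNames() calls in the second line are just one of many ways to adjust the column order and names to match the desired output shown in the question.)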

x <- setNames(dlist$text, dlist$name)
setNames(rev(stack(x)),  c("name", "text"))
#   name  text
# 1    a   the
# 2    a quick
# 3    a brown
# 4    b   fox
# 5    b jumps
# 6    b  over
# 7    b   the
# 8    c  lazy
# 9    c   dog
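
To see why the reordering and renaming step is there, it may help to look at what stack() returns on its own. The sketch below is only an illustration: the names `s`, `long`, and `counts` are mine, and it selects the columns by name instead of calling rev(), which is just another of the many ways mentioned above.

```r
dlist <- list(name = c("a", "b", "c"),
              text = list(c("the", "quick", "brown"),
                          c("fox", "jumps", "over", "the"),
                          c("lazy", "dog")))

# Named list: one character vector per document
x <- setNames(dlist$text, dlist$name)

# stack() puts the words in "values" and the list names in "ind"
s <- stack(x)
names(s)

# Pick the columns in the desired order, then rename them
long <- setNames(s[c("ind", "values")], c("name", "text"))

# Once the data is in long form, per-document word counts are one call away
counts <- table(long$name)
counts
```

A long data.frame like `long` has one row per observation, which is the shape ggplot2 generally expects.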

Answer 2

Another solution, perhaps more general:

do.call(rbind, do.call(mapply, c(dlist, FUN = data.frame, SIMPLIFY = FALSE)))

#     name  text
# a.1    a   the
# a.2    a quick
# a.3    a brown
# b.1    b   fox
# b.2    b jumps
# b.3    b  over
# b.4    b   the
# c.1    c  lazy
# c.2    c   dog
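
The one-liner above can be hard to read, so here is a rough unpacking into two steps. It uses Map(), which calls mapply() with SIMPLIFY = FALSE, and spells out the two columns explicitly instead of splicing in dlist with do.call(); the names `pieces` and `long` are only for illustration.

```r
dlist <- list(name = c("a", "b", "c"),
              text = list(c("the", "quick", "brown"),
                          c("fox", "jumps", "over", "the"),
                          c("lazy", "dog")))

# Step 1: build one small data.frame per document.
# Inside each call, the single name is recycled to match the number of words.
pieces <- Map(data.frame, name = dlist$name, text = dlist$text)

# The result stays a plain list of data.frames, named after the documents
names(pieces)

# Step 2: bind the per-document data.frames together row-wise
long <- do.call(rbind, pieces)
long
```

Keeping the intermediate result as a list matters here: the simplification that sapply() attempts by default is what produced the odd matrix of "factor,3" cells in the question.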

Answer 3

Josh's answer is far sweeter, but I thought I'd throw my hat in the ring.

dlist <- structure(list(name = c("a", "b", "c"), 
    text = list(c("the", "quick", "brown"), 
    c("fox", "jumps", "over", "the"), c("lazy", "dog"))), 
    .Names = c("name", "text"))

lens <- sapply(unlist(dlist[-1], recursive = FALSE), length)

data.frame(name = rep(dlist[[1]], lens), text = unlist(dlist[-1]), row.names = NULL)

##   name  text
## 1    a   the
## 2    a quick
## 3    a brown
## 4    b   fox
## 5    b jumps
## 6    b  over
## 7    b   the
## 8    c  lazy
## 9    c   dog

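That being said, a list of lists is a bit of an awkward storage method. A list of vectors (particularly a named list of vectors) would be easier to deal with.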
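
As a rough illustration of that last point, here is the same idea starting from a named list of vectors. The object `docs` is a hypothetical restructuring of the question's data, not something from the original post, and lengths() assumes R 3.2.0 or later.

```r
# Hypothetical restructuring: document names become the list names
docs <- list(a = c("the", "quick", "brown"),
             b = c("fox", "jumps", "over", "the"),
             c = c("lazy", "dog"))

# Repeat each document name once per word, then flatten the words
long <- data.frame(name = rep(names(docs), lengths(docs)),
                   text = unlist(docs, use.names = FALSE))
long
```

With this layout there is no need to index into dlist[[1]] and dlist[-1] separately, because the names travel with the vectors.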
