A more efficient way to parse the elements of a nested list

Time: 2014-07-12 07:10:50

Tags: r

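I'm working on a function that parses a nested list. Unfortunately, due to the nature of the raw data, I really can't think of another way to do it. The last three pieces of code in the function scare me a little, but they do get the job done. Here they are: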

mkList <- lapply(rec, function(x){
      lapply(regex, function(y) grep(y, x, value = TRUE)) })
rem <- lapply(mkList, function(x){
      lapply(x, function(y) sub("[a-z]+,", "", y)) })
lapply(rem, read.as.csv)

Yes, you're seeing that correctly: it calls lapply five times in a row. And yes, you guessed it, read.as.csv also calls lapply.
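For illustration only, the grep pass and the sub pass can be fused into one nested traversal, which removes two of the five lapply calls. This is a minimal sketch: `rec` and `regex` below are hypothetical stand-ins, since the real objects are not shown, and read.as.csv is left out because its definition is not shown either.

```r
# Hypothetical stand-ins for `rec` and `regex` (the real ones are not shown).
rec <- list(c("abc,1,2", "xyz,3,4"), c("abc,5,6", "xyz,7,8"))
regex <- c("^abc", "^xyz")

# Two-pass form from the question: one nested lapply per step.
mkList <- lapply(rec, function(x) {
  lapply(regex, function(y) grep(y, x, value = TRUE))
})
rem <- lapply(mkList, function(x) {
  lapply(x, function(y) sub("[a-z]+,", "", y))
})

# Single-pass form: filter and strip the leading label in the same inner function.
fused <- lapply(rec, function(x) {
  lapply(regex, function(y) sub("[a-z]+,", "", grep(y, x, value = TRUE)))
})

# Both forms should produce the same nested list.
print(identical(rem, fused))
```

Whether this is noticeably faster depends on the data; the main gain is that the nested structure is walked once instead of twice.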


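To make a small reproducible example, consider the nested list x and the "double" lapply block that follows it. The result is exactly what I want, but I'm curious:

Is there a better, more efficient way to apply a function to the inner lists of a nested list?

The inner list elements are vectors of CSV strings of differing lengths.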

> ( x <- list(list(a = c("a,b,c", "d,e,f"), 
                   b = c("1,2,a,b,c,d", "3,4,e,f,g,h"))) )

# [[1]]
# [[1]]$a
# [1] "a,b,c" "d,e,f"
#
# [[1]]$b
# [1] "1,2,a,b,c,d" "3,4,e,f,g,h"

> lapply(x, function(y){
      lapply(y, function(z) do.call(rbind, strsplit(z, ",")))
  })

# [[1]]
# [[1]]$a
#      [,1] [,2] [,3]
# [1,] "a"  "b"  "c" 
# [2,] "d"  "e"  "f" 
# 
# [[1]]$b
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] "1"  "2"  "a"  "b"  "c"  "d" 
# [2,] "3"  "4"  "e"  "f"  "g"  "h" 

1 answer:

Answer 0 (score: 2)

Among the lesser-known functions in the *apply family is rapply, short for "recursive lapply". It seems to be what you're trying to do:

rapply(x, function(y) do.call(rbind, strsplit(y, ",", TRUE)), how = "replace")
# [[1]]
# [[1]]$a
#      [,1] [,2] [,3]
# [1,] "a"  "b"  "c" 
# [2,] "d"  "e"  "f" 
# 
# [[1]]$b
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] "1"  "2"  "a"  "b"  "c"  "d" 
# [2,] "3"  "4"  "e"  "f"  "g"  "h" 

For this particular example it lags behind your approach, but as you scale the example up, it proves to be more efficient.
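As a quick check, the sketch below compares the nested lapply form with the rapply form on the data from the question, then wraps the list in one more level to show that rapply needs no extra lapply for the added depth. The helper name `split_csv` and the wrapper list `deeper` are illustrative and not part of the original code.

```r
# Same data as in the question.
x <- list(list(a = c("a,b,c", "d,e,f"),
               b = c("1,2,a,b,c,d", "3,4,e,f,g,h")))

# Illustrative helper: split each CSV string and stack the pieces as matrix rows.
split_csv <- function(v) do.call(rbind, strsplit(v, ",", fixed = TRUE))

# One lapply per nesting level versus a single recursive call.
nested <- lapply(x, function(y) lapply(y, split_csv))
recursive <- rapply(x, split_csv, how = "replace")
print(identical(nested, recursive))

# An extra level of nesting: rapply still reaches the character leaves.
# `classes` restricts the function to character vectors only.
deeper <- list(outer = x)
res <- rapply(deeper, split_csv, classes = "character", how = "replace")
print(dim(res$outer[[1]]$b))
```

To compare speed on larger inputs, both forms can be wrapped in system.time; the results depend on the size and depth of the list.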