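Creating a nested list structure from a string

Time: 2015-11-11 13:41:01

Tags: r string list

I have a character vector whose elements are each made up of n substrings. It looks like this: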

string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

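Each subcomponent in these strings is separated from the others by "_". Here, the first level consists of the values "A" and "B", the second level of "AA", "BB" and "CC", and the third level of "AAA". Deeper nesting is possible, and the solution should extend to those cases. The nesting is not necessarily balanced: for example, "A" has only two children while "B" has three, but "A" also has a grandchild, which "B" does not.

Basically, I want to recreate the nested structure of this string in some R object, preferably a list. The nested list structure would therefore look like this: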
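As a rough illustration of the levels described above, the following sketch splits each string on "_" and collects the distinct values found at each depth. The names `parts` and `levels_by_depth` are made up for this example and are not part of the question.

```r
string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

# Split every string into its components (fixed = TRUE: "_" is taken literally)
parts <- strsplit(string, "_", fixed = TRUE)

# For each depth k, gather the distinct values that occur at position k;
# strings shorter than k contribute NULL, which unlist() drops
levels_by_depth <- lapply(seq_len(max(lengths(parts))), function(k) {
  unique(unlist(lapply(parts, function(p) if (length(p) >= k) p[k])))
})
```

Here `levels_by_depth[[1]]` holds the first-level values, `levels_by_depth[[2]]` the second-level values, and so on.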

list("A" = list("AA", "BB" = list("AAA")),
"B" = list("AA", "BB", "CC"))

$A
$A[[1]]
[1] "AA"

$A$BB
$A$BB[[1]]
[1] "AAA"



$B
$B[[1]]
[1] "AA"

$B[[2]]
[1] "BB"

$B[[3]]
[1] "CC"

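Any help with this is appreciated.

3 answers:

Answer 0: (score: 2)

You can turn it into a matrix without much fuss, with one row per string and one column per nesting level (shorter strings are padded with NA):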

string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

splitted<-strsplit(string,"_")
cols<-max(lengths(splitted))
mat<-do.call(rbind,lapply(splitted, "length<-", cols))
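To make the shape of that matrix concrete, here is a small sketch that rebuilds it and inspects it. The `depth` variable is an addition for illustration only. Note that this gives a rectangular view of the paths rather than the nested list asked for in the question.

```r
string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

splitted <- strsplit(string, "_")
cols <- max(lengths(splitted))

# "length<-" pads each split vector with NA up to `cols` elements,
# so rbind() can stack them into a character matrix
mat <- do.call(rbind, lapply(splitted, "length<-", cols))

# Number of non-missing levels per row, i.e. the depth of each string
depth <- rowSums(!is.na(mat))

print(mat)
print(depth)
```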

Answer 1: (score: 1)

Not so straightforward, nor the prettiest code, but it should do its job and return a list:

string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

# loop through each element of the string "str_el"
list_els <- lapply(string, function(str_el) {

  # split the string into parts
  els <- strsplit(str_el, "_")[[1]]

  # loop backwards through the elements
  for (i in length(els):1){

    # the last element gives the value
    if (i == length(els)){

      # assign the value to a list and rename the list          
      res <- list(els[[i]])
      names(res) <- els[[i - 1]]

    } else {
      # if it's not the last element (value), wrap the list res in another list
      # named after the preceding element
      if (i != 1) {
        res <- list(res)
        names(res) <- els[[i - 1]]
      }
    }
  }

  return(res)
})

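# collect the per-string lists; note that this does not merge entries
# that share a name, so the result has one element per input string
res_list <- mapply(c, list_els, SIMPLIFY = FALSE)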

res_list
# [[1]]
# [[1]]$A
# [1] "AA"
# 
# 
# [[2]]
# [[2]]$A
# [1] "BB"
# 
# 
# [[3]]
# [[3]]$A
# [[3]]$A$BB
# [1] "AAA"
# 
# 
# 
# [[4]]
# [[4]]$B
# [1] "AA"
# 
# 
# [[5]]
# [[5]]$B
# [1] "BB"
# 
# 
# [[6]]
# [[6]]$B
# [1] "CC"

Does this give you what you want?

Answer 2: (score: 0)

I found this approach. It is odd, but it seems to work:
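The output above still contains one list per input string. If a single merged structure like the one in the question is wanted, a recursive sketch along the following lines might work. `build_nested` is a hypothetical helper written for this illustration, not part of the answer. It assumes that a value which has children is kept only as a named branch and not also as a leaf, as in the question's desired output.

```r
string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

# paths: a list of character vectors, one per string, already split on "_"
build_nested <- function(paths) {
  heads <- vapply(paths, `[`, character(1), 1)   # first component of each path
  out <- list()
  for (h in unique(heads)) {
    # Remainders of all paths that start with h, dropping exhausted ones
    rest <- lapply(paths[heads == h], `[`, -1)
    rest <- rest[lengths(rest) > 0]
    if (length(rest) == 0) {
      out <- c(out, list(h))            # no children: store h as an unnamed leaf
    } else {
      out[[h]] <- build_nested(rest)    # children exist: store a named branch
    }
  }
  out
}

nested <- build_nested(strsplit(string, "_", fixed = TRUE))
str(nested)
```

Because the function recurses once per level, it should extend to deeper and unbalanced nesting without changes.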

my_relist <- function(x) {
  y <- list()

  # The first loop creates the skeleton of the list
  for (name in x) {
    split <- strsplit(name, "_", fixed = TRUE)[[1]]
    l <- length(split)
    char <- "y"
    for (i in seq_len(l - 1)) {
      char <- paste(char, "$", split[i], sep = "")
    }
    # Example of char2: "y$A$BB = list()"
    char2 <- paste(char, " = list()", sep = "")
    # Evaluate char2 only if the branch does not exist yet, so that an
    # already created deeper branch is not overwritten
    if (is.null(eval(parse(text = char)))) {
      eval(parse(text = char2))
    }
  }

  # The second loop fills the list with the last element
  for (name in x) {
    split <- strsplit(name, "_", fixed = TRUE)[[1]]
    l <- length(split)
    char <- "y"
    for (i in seq_len(l - 1)) {
      char <- paste(char, "$", split[i], sep = "")
    }
    # Example of char3: "y$A = c(y$A, split[l])"
    char3 <- paste(char, " = c(", char, ", split[l])", sep = "")
    eval(parse(text = char3))
  }

  y
}

Here is the result:

example <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")
my_relist(example)
# $A
# $A$BB
# $A$BB[[1]]
# [1] "AAA"
#
#
# $A[[2]]
# [1] "AA"
#
# $A[[3]]
# [1] "BB"
#
#
# $B
# $B[[1]]
# [1] "AA"
#
# $B[[2]]
# [1] "BB"
#
# $B[[3]]
# [1] "CC"