R - How to create a list of lists in a loop

Date: 2018-06-03 14:28:36

Tags: r list

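I want to create a list of lists. Each list should contain all the keywords from one particular XML file in my folder, so the number of lists equals the number of files. The problem is that I don't know how many files are in the folder. I tried to create the list of lists in a loop like this: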

my_keywords <-
  list(my_keywords,m) 

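But the result has too many nested lists. So I tried to create a matrix of lists and convert the matrix to a list of lists after the loop. Here is my code: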

  #read all xml files in folder data
    f <- list.files(path = "C:\\data\\", pattern = "*.xml", all.files = FALSE,
               full.names = TRUE, recursive = FALSE)

    keywords_matrix <- matrix("", ncol=1, nrow = length(f))
    i<-0

    #for each xml file read and save all keywords
    for (sig in f) {
        i<-i+1
        data_xml <- xmlTreeParse(sig,useInternalNodes=TRUE)
        xml_list <- xpathApply(data_xml, "//keyword", xmlValue)
        #every keyword is in own list, give all keywords to one list
        m <- unlist(xml_list)
        #every keywords list in one row of matrix
        keywords_matrix[i,] <-list(m)

    }

    print (keywords_matrix)
    mylist <- apply(keywords_matrix, 1, as.list)

But my code doesn't work. It gives me these errors:

> Error in keywords_matrix[i, ] <- list(m) : 
>  incorrect number of subscripts on matrix

---

>    Error in apply(keywords_matrix, 1, as.list) : 
>       dim(X) must have a positive length

My matrix looks like:

[[1]]
[1] "a11" "a12" "a13"        

[[2]]
[1]  ""

[[3]]
[1]  "" 

What I want is mylist, like this:

[[1]]
"a11" "a12" "a13"        

[[2]]
"b11"        "b12"       "b13" "b14" 

[[3]]
 "c11"        "c12"      

Any help? I don't understand why this doesn't work. The matrix indexing looks fine to me.

My XML files look like this:

<rule id="1">
    <date>2018-01-12</date>
    <name>name of A element</name>
    <allkeywords>  
        <keyword>a11</keyword>
    <keyword>a12</keyword>
     <keyword>a13</keyword>    
     </allkeywords>  
</rule>

and:

  <rule id="2">
        <date>2018-01-12</date>
        <name>name of B element</name>
        <allkeywords>  
            <keyword>b11</keyword>
        <keyword>b12</keyword>
         <keyword>b13</keyword>  
        <keyword>b14</keyword> 
         </allkeywords>  
    </rule>

2 answers:

Answer 0: (score: 2)

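Your expected result is not a nested list; it is just a regular list of vectors. Your code can work simply by initializing an empty list and adding each element to it inside the loop.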

library(XML) 

f <- list.files(pattern = "\\.xml$", all.files = FALSE,
                  full.names = TRUE, recursive = FALSE)
i<-0
mylist<-list() #initialize list 

for (sig in f) {
  i<-i+1
  data_xml <- xmlTreeParse(sig,useInternalNodes=TRUE)
  xml_list <- xpathApply(data_xml, "//keyword", xmlValue)
  m <- unlist(xml_list)
  mylist[[i]]<-m #add each element to list
}

mylist 

[[1]]
[1] "a11" "a12" "a13"

[[2]]
[1] "b11" "b12" "b13" "b14"
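
To see why the list(my_keywords, m) attempt from the question nests more deeply on every iteration while indexed assignment does not, here is a minimal sketch that needs no XML files. The keyword_sets data is made up for illustration, and preallocating with vector("list", n) is just one common option, not a requirement.

```r
# Made-up keyword vectors standing in for the parsed XML results
keyword_sets <- list(c("a11", "a12", "a13"),
                     c("b11", "b12", "b13", "b14"),
                     c("c11", "c12"))

# Approach from the question: each pass wraps the previous result in a new list
nested <- list()
for (m in keyword_sets) {
  nested <- list(nested, m)
}

# Indexed assignment: each vector goes into its own slot, with no extra nesting
flat <- vector("list", length(keyword_sets))
for (i in seq_along(keyword_sets)) {
  flat[[i]] <- keyword_sets[[i]]
}

str(nested)  # always two elements: the earlier (nested) result and the latest vector
str(flat)    # one element per keyword vector
```

The same idea carries over to the loop over files: the slot index keeps the result flat regardless of how many files there are.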

Or you can do it with lapply:

mylist<-lapply(f, function(x){
  data_xml <- xmlTreeParse(x,useInternalNodes=TRUE)
  xml_list <- xpathApply(data_xml, "//keyword", xmlValue)
  unlist(xml_list) # return the keywords as a character vector
})
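
If it helps to know which keywords came from which file, one option is to name the list elements after the files. This is a small sketch under assumptions: the file paths and keyword vectors below are hypothetical stand-ins for f and for the result of the lapply call.

```r
# Hypothetical file paths and keyword vectors, standing in for real results
f <- c("C:/data/rule_a.xml", "C:/data/rule_b.xml")
mylist <- list(c("a11", "a12", "a13"),
               c("b11", "b12", "b13", "b14"))

# Name each element after its file, without directory or extension
names(mylist) <- tools::file_path_sans_ext(basename(f))

# Elements can then be looked up by file name instead of by position
mylist[["rule_b"]]
```

Since lapply returns its results in the same order as f, assigning the names afterwards keeps files and keywords aligned.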

Answer 1: (score: 1)

Simply use lapply to run xpathSApply from the XML package over the list of XML files. There is no need for a matrix as an intermediate helper container.

library(XML)

#read all xml files in folder data
f <- list.files(path = "C:\\data\\", pattern = "\\.xml$", all.files = FALSE,
                full.names = TRUE, recursive = FALSE)

mylist <- lapply(f, function(i){
  data_xml <- xmlTreeParse(i, useInternalNodes=TRUE)
  xml_list <- xpathSApply(data_xml, "//keyword", xmlValue)
})

mylist

# [[1]]
# [1] "a11" "a12" "a13"

# [[2]]
# [1] "b11" "b12" "b13" "b14"
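
For anyone who wants to try this without a C:\data folder, here is a self-contained sketch of the same lapply approach. It assumes the XML package is installed; the scratch folder name, the file names, and the shortened XML contents are made up for the demo. Because the pattern argument of list.files is a regular expression, the sketch uses "\\.xml$" to match only names ending in .xml.

```r
library(XML)

# Create a scratch folder with two small XML files (made-up demo data)
demo_dir <- file.path(tempdir(), "xml_keywords_demo")
dir.create(demo_dir, showWarnings = FALSE)

writeLines(
  '<rule id="1"><allkeywords><keyword>a11</keyword><keyword>a12</keyword><keyword>a13</keyword></allkeywords></rule>',
  file.path(demo_dir, "rule_a.xml")
)
writeLines(
  '<rule id="2"><allkeywords><keyword>b11</keyword><keyword>b12</keyword><keyword>b13</keyword><keyword>b14</keyword></allkeywords></rule>',
  file.path(demo_dir, "rule_b.xml")
)

# Collect every file whose name ends in .xml, however many there are
f <- list.files(demo_dir, pattern = "\\.xml$", full.names = TRUE)

# One character vector of keywords per file
mylist <- lapply(f, function(path) {
  doc <- xmlParse(path)
  xpathSApply(doc, "//keyword", xmlValue)
})

mylist
```

The number of files never has to be known in advance, because lapply produces exactly one list element per entry of f.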