How to convert an XML file to a data frame in R

Asked: 2019-02-20 20:09:21

Tags: r xml dataframe

I am trying to parse this XML and put it into a data frame.

The file contents are as follows:

<?xml version="1.0" encoding="utf-8" ?>
<dashboardreport name="Incident_Rules" version="7.2.5.1022" reportdate="2019-02-20T14:45:57.352-05:00" description="">
  <source name="app1">
    <filters summary="last 30 minutes (auto)">
      <filter>tf:DiagnoseTimeframe?1550690157352:1550691957352</filter>
    </filters>
  </source>
  <reportheader>
    <reportdetails>
      <user>user1</user>
    </reportdetails>
  </reportheader>
  <data>
    <incidentchartdashlet name="Incident Chart" description="">
      <incidentchartrecords structuretype="tree">
        <incidentchartrecord rule="Database Exception" systemprofile="app1" />
        <incidentchartrecord rule="Response time greater than 30 minutes" systemprofile="app1" />
        <incidentchartrecord rule="JVM Heap Utilization > 90%" systemprofile="app1" />
      </incidentchartrecords>
    </incidentchartdashlet>
  </data>
</dashboardreport>

The data frame should look like this:

Source Name      Rule
App1         Database Exception
App1         Response time greater than 30 minutes
App1         JVM Heap Utilization > 90%

I need to extract the source name and the rule of each incident chart record. I have tried something like this:

library("XML")
doc <- read_xml(file)
  dat<-xml_find_all(doc, ".//incidentchartrecord") %>%
    map_df(function(x) {
      xml_find_all(x, ".//incidentchartrecord") %>%
        map_df(~as.list(xml_attrs(.))) %>%
        select(rule) %>%
        mutate(node=xml_attr(x, "incidentchartrecord"))
    })

Any ideas?

1 Answer:

Answer 0: (score: 1)

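Here is one approach that works. I switched to the xml2 package, since that is where the read_xml, xml_find_all and xml_attr functions are found (the XML package loaded in the question does not provide them).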

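library(xml2)

# Parse the XML file
doc <- read_xml("test.xml")
# name attribute of the <source> element
source_name <- xml_attr(xml_find_all(doc, ".//source"), "name")
# rule attribute of every <incidentchartrecord> element
rules <- xml_attr(xml_find_all(doc, ".//incidentchartrecord"), "rule")
# source_name has length 1, so data.frame() recycles it across all rules
df <- data.frame(Source.Name = source_name, Rule = rules, stringsAsFactors = FALSE)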
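
To try the approach without a file on disk, the following is a minimal sketch that inlines a trimmed copy of the report as a string. The variable names (report_xml, source_name, records) are illustrative and not from the original post. It assumes the report contains a single <source> element, and it repeats the source name explicitly with rep() instead of relying on recycling, so a report with no records should give an empty data frame rather than an error.

```r
library(xml2)

# Trimmed copy of the report from the question, inlined for illustration.
# The ">" in the last rule is written as &gt; here; both forms parse the same.
report_xml <- '<?xml version="1.0" encoding="utf-8"?>
<dashboardreport name="Incident_Rules">
  <source name="app1"/>
  <data>
    <incidentchartdashlet name="Incident Chart">
      <incidentchartrecords structuretype="tree">
        <incidentchartrecord rule="Database Exception" systemprofile="app1"/>
        <incidentchartrecord rule="Response time greater than 30 minutes" systemprofile="app1"/>
        <incidentchartrecord rule="JVM Heap Utilization &gt; 90%" systemprofile="app1"/>
      </incidentchartrecords>
    </incidentchartdashlet>
  </data>
</dashboardreport>'

# read_xml() accepts literal XML as well as a file path
doc <- read_xml(report_xml)

# xml_find_first() returns one node, so source_name always has length 1
source_name <- xml_attr(xml_find_first(doc, ".//source"), "name")
records <- xml_find_all(doc, ".//incidentchartrecord")

# Repeat the source name once per record (assumption: one source per report)
df <- data.frame(
  Source.Name = rep(source_name, length(records)),
  Rule = xml_attr(records, "rule"),
  stringsAsFactors = FALSE
)
print(df)
```

Note that the source name comes back as "app1", exactly as it is spelled in the XML; if the capitalised "App1" from the desired output is required, it would have to be changed after extraction.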