I am trying to parse this XML file and put its contents into a data frame.
The file looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<dashboardreport name="Incident_Rules" version="7.2.5.1022" reportdate="2019-02-20T14:45:57.352-05:00" description="">
  <source name="app1">
    <filters summary="last 30 minutes (auto)">
      <filter>tf:DiagnoseTimeframe?1550690157352:1550691957352</filter>
    </filters>
  </source>
  <reportheader>
    <reportdetails>
      <user>user1</user>
    </reportdetails>
  </reportheader>
  <data>
    <incidentchartdashlet name="Incident Chart" description="">
      <incidentchartrecords structuretype="tree">
        <incidentchartrecord rule="Database Exception" systemprofile="app1" />
        <incidentchartrecord rule="Response time greater than 30 minutes" systemprofile="app1" />
        <incidentchartrecord rule="JVM Heap Utilization > 90%" systemprofile="app1" />
      </incidentchartrecords>
    </incidentchartdashlet>
  </data>
</dashboardreport>
The data frame should look like this:
Source Name    Rule
App1           Database Exception
App1           Response time greater than 30 minutes
App1           JVM Heap Utilization > 90%
I need to extract the source name and the rule of each incident record. I have tried something like this:
library("XML")
doc <- read_xml(file)
dat <- xml_find_all(doc, ".//incidentchartrecord") %>%
  map_df(function(x) {
    xml_find_all(x, ".//incidentchartrecord") %>%
      map_df(~as.list(xml_attrs(.))) %>%
      select(rule) %>%
      mutate(node = xml_attr(x, "incidentchartrecord"))
  })
Any ideas?
Answer 0 (score: 1)
Here is one approach that works. I switched to the xml2 package, which is where the xml_find_all and xml_attr functions are found.
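library(xml2)
# Parse the report; "test.xml" is the XML file shown in the question
doc <- read_xml("test.xml")
# name attribute of the <source> element (avoids shadowing base::source)
source_name <- xml_attr(xml_find_all(doc, ".//source"), "name")
# rule attribute of every <incidentchartrecord> element
rules <- xml_attr(xml_find_all(doc, ".//incidentchartrecord"), "rule")
# The single source name is recycled across all rules
df <- data.frame(Source.Name = source_name, Rule = rules, stringsAsFactors = FALSE)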
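The data frame above pairs one source name with every rule, which relies on the report containing a single source element. A possible variation, sketched here under the assumption that the systemprofile attribute of each record always matches the name of its source (it does in the sample file, but that is not confirmed for other reports), reads both columns from the same record nodes. The XML is inlined as a trimmed copy of the sample, and the names report_xml, records, and df_by_record are only illustrative.

```r
library(xml2)

# Trimmed inline copy of the sample report, so the sketch runs without test.xml
report_xml <- '<dashboardreport name="Incident_Rules">
  <source name="app1" />
  <data>
    <incidentchartdashlet name="Incident Chart">
      <incidentchartrecords structuretype="tree">
        <incidentchartrecord rule="Database Exception" systemprofile="app1" />
        <incidentchartrecord rule="Response time greater than 30 minutes" systemprofile="app1" />
        <incidentchartrecord rule="JVM Heap Utilization > 90%" systemprofile="app1" />
      </incidentchartrecords>
    </incidentchartdashlet>
  </data>
</dashboardreport>'

# read_xml() accepts literal XML as well as a file path
doc <- read_xml(report_xml)

# Select every record once, then read both attributes from the same nodes
records <- xml_find_all(doc, ".//incidentchartrecord")

# Assumption: systemprofile identifies the source each record belongs to
df_by_record <- data.frame(
  Source.Name = xml_attr(records, "systemprofile"),
  Rule = xml_attr(records, "rule"),
  stringsAsFactors = FALSE
)
```

Because both columns come from the same node set, they always have the same length, so nothing depends on vector recycling.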