I'm trying to build a data mashup in R from various security controls. I've had great success with devices that output CSV, JSON, and so on, but XML has really been frustrating me. You'll quickly see that I'm not the rock-star R developer I'd like to be, but I'd greatly appreciate any help. The end result I'm hoping for is a tidy R data frame that I can use for analysis. Here is a simplified version of the XML I'm trying to parse:
<devices>
<host id="169274" persistent_id="21741">
<ip>some_IP_here</ip>
<hostname>Some_DNS_name_here </hostname>
<netbiosname>Some_NetBios_Name_here</netbiosname>
<hscore>663</hscore>
<howner>4</howner>
<assetvalue>4</assetvalue>
<os>Unix Variant</os>
<nbtshares/>
<fndvuln id="534" port="80" proto="tcp"/>
<fndvuln id="1191" port="22" proto="tcp"/>
</host>
<host id="169275" persistent_id="21003">
<ip>some_IP_here</ip>
<hostname>Some_DNS_name_here </hostname>
<netbiosname>Some_NetBios_Name_here</netbiosname>
<hscore>0</hscore>
<howner>4</howner>
<assetvalue>4</assetvalue>
<os>OS Undetermined</os>
<nbtshares/>
<fndvuln id="5452" port="ip" proto="ip"/>
<fndvuln id="5092" port="123" proto="udp"/>
<fndvuln id="16157" port="123" proto="udp"/>
</host>
</devices>
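At the simplest level, I have no trouble parsing the XML and extracting the data needed to build a basic data frame. What I can't work out is how to iterate over the parsed XML and create a separate row each time a fndvuln element appears under its parent host node.
My guess so far is that it would be best to load each element separately and bind them together at the end. I think this would let me use sapply to run over the various fndvuln instances and create a separate entry for each. In a perfect world, the data frame would look like this: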
    host           ip           hostname            netbiosname VulnID port protocol
1 169274 some_IP_here Some_DNS_name_here Some_NetBios_Name_here    534   80      tcp
2 169274 some_IP_here Some_DNS_name_here Some_NetBios_Name_here   1191   22      tcp
3 169275 some_IP_here Some_DNS_name_here Some_NetBios_Name_here   5452   ip       ip
4 169275 some_IP_here Some_DNS_name_here Some_NetBios_Name_here   5092  123      udp
5 169275 some_IP_here Some_DNS_name_here Some_NetBios_Name_here  16157  123      udp
Here is the basic structure I have so far, which gives me one row per host with the host-level fields only:
library(XML)
setwd("My_file_location_here")
xmlfile <- "vuln.xml"
xmldoc <- xmlParse(xmlfile)
# One node per <host> element
vuln <- getNodeSet(xmldoc, "//host")
# Build a one-row data frame per host from its id attribute and child elements
x <- lapply(vuln, function(x) data.frame(
  host        = xpathSApply(x, ".", xmlGetAttr, "id"),
  ip          = xpathSApply(x, ".//ip", xmlValue),
  hostname    = xpathSApply(x, ".//hostname", xmlValue),
  netbiosname = xpathSApply(x, ".//netbiosname", xmlValue)
))
# Stack the per-host data frames into one
do.call("rbind", x)
I'm not sure how to do the rest. Also, because this device emits fairly large XML files, knowing how to do this efficiently is my ultimate goal.
Answer 0 (score: 0)
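When you add the fndvuln elements to the data.frame, the length-one values (host, ip, hostname, and so on) are recycled to match the number of fndvuln entries (try data.frame("a", 1:3)):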
# The fndvuln attribute vectors have one entry per vulnerability;
# the length-one host fields are recycled to the same length
x <- lapply(vuln, function(x) data.frame(
  host     = xpathSApply(x, ".", xmlGetAttr, "id"),
  ip       = xpathSApply(x, ".//ip", xmlValue),
  hostname = xpathSApply(x, ".//hostname", xmlValue),
  VulnID   = xpathSApply(x, ".//fndvuln", xmlGetAttr, "id"),
  port     = xpathSApply(x, ".//fndvuln", xmlGetAttr, "port")
))
do.call("rbind", x)
    host           ip           hostname VulnID port
1 169274 some_IP_here Some_DNS_name_here    534   80
2 169274 some_IP_here Some_DNS_name_here   1191   22
3 169275 some_IP_here Some_DNS_name_here   5452   ip
4 169275 some_IP_here Some_DNS_name_here   5092  123
5 169275 some_IP_here Some_DNS_name_here  16157  123
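The same recycling idea can be extended to the full layout asked for in the question (netbiosname and protocol included). The following is only a sketch, not a tested solution for the real device output: the inline xml_text sample stands in for vuln.xml, and host_rows, child_text, and vuln_attr are hypothetical helper names introduced here. It also assumes that a host with no fndvuln children should still appear as a single row with NA in the vulnerability columns, which the question does not specify.

```r
library(XML)

# Inline sample standing in for vuln.xml (an assumption for this sketch);
# the second host deliberately has no fndvuln children.
xml_text <- '<devices>
  <host id="169274" persistent_id="21741">
    <ip>some_IP_here</ip>
    <hostname>Some_DNS_name_here </hostname>
    <netbiosname>Some_NetBios_Name_here</netbiosname>
    <fndvuln id="534" port="80" proto="tcp"/>
    <fndvuln id="1191" port="22" proto="tcp"/>
  </host>
  <host id="169275" persistent_id="21003">
    <ip>some_IP_here</ip>
    <hostname>Some_DNS_name_here </hostname>
    <netbiosname>Some_NetBios_Name_here</netbiosname>
  </host>
</devices>'

xmldoc <- xmlParse(xml_text, asText = TRUE)
hosts  <- getNodeSet(xmldoc, "//host")

# Hypothetical helper: text of the first matching child element, or NA if absent.
child_text <- function(node, name) {
  v <- xpathSApply(node, paste0("./", name), xmlValue)
  if (length(v) == 0) NA_character_ else trimws(v[[1]])
}

# Hypothetical helper: one row per fndvuln under a host (one NA row if none).
host_rows <- function(h) {
  vulns <- getNodeSet(h, "./fndvuln")
  vuln_attr <- function(name) {
    if (length(vulns) == 0) return(NA_character_)
    vapply(vulns, function(v) xmlGetAttr(v, name, default = NA_character_),
           character(1), USE.NAMES = FALSE)
  }
  data.frame(
    host        = xmlGetAttr(h, "id", default = NA_character_),
    ip          = child_text(h, "ip"),
    hostname    = child_text(h, "hostname"),
    netbiosname = child_text(h, "netbiosname"),
    VulnID      = vuln_attr("id"),
    port        = vuln_attr("port"),
    protocol    = vuln_attr("proto"),
    stringsAsFactors = FALSE
  )
}

# Build all per-host pieces first, then bind once at the end.
result <- do.call(rbind, lapply(hosts, host_rows))
print(result)
```

Binding once at the end, rather than growing a data frame inside a loop, is usually the cheaper pattern, but whether it is fast enough for the large files mentioned in the question would need to be measured on real data.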