R XML: extracting specific nodes

Date: 2012-07-29 19:32:31

Tags: r web-scraping

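I would like to know how to reach a specific node using R's XML package. Here is an example using the mtcars dataset that ships with the package.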

fileName <- system.file("exampleData", "mtcars.xml", package="XML") 
doc <- xmlTreeParse(fileName)
doc$doc$children$dataset

Running the code above gives me the following result:

....
 <record id="Fiat 128">32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1</record>
 <record id="Honda Civic">30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2</record>
 <record id="Toyota Corolla">33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1</record>
 <record id="Toyota Corona">21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1</record>
 <record id="Dodge Challenger">15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2</record>
 <record id="AMC Javelin">15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2</record>
 <record id="Camaro Z28">13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4</record>
 <record id="Pontiac Firebird">19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    
....

I would like to know how to select specific nodes and get their values with xmlAttrs. For example, how do I select the node <record id="Fiat 128"> or the node <record id="Honda Civic">?

1 Answer:

Answer 0: (score: 6)

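# xmlParse(fileName) is equivalent to xmlTreeParse(fileName, useInternalNodes=TRUE);
# XPath functions such as xpathSApply need this internal-node representation.
doc <- xmlParse(fileName)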
xpathSApply(doc,"//*/record[@id=\"Fiat 128\"]",xmlValue)
xpathSApply(doc,"//*/record[@id=\"Honda Civic\"]",xmlValue)

Use xmlParse, which is equivalent to xmlTreeParse(useInternalNodes=T). For information on XPath, see https://www.w3schools.com/xml/xpath_syntax.asp
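Since the question also asks about xmlAttrs, the following is a minimal sketch of how the same XPath approach might be extended to return attributes as well as values. It assumes the XML package is installed and that mtcars.xml has the layout shown in the question; the variable names (nodes, ids, attrs, values, fiat) are illustrative and not part of the original answer.

```r
library(XML)

# Parse into an internal document so that XPath queries can be used
fileName <- system.file("exampleData", "mtcars.xml", package = "XML")
doc <- xmlParse(fileName)

# Select both records in one query; getNodeSet returns a list of nodes
nodes <- getNodeSet(doc, "//record[@id='Fiat 128' or @id='Honda Civic']")

# xmlGetAttr reads a single named attribute from each node
ids <- sapply(nodes, xmlGetAttr, "id")

# xmlAttrs returns all attributes of a node as a named character vector
attrs <- lapply(nodes, xmlAttrs)

# xmlValue returns the text content of each node
values <- sapply(nodes, xmlValue)

# Split the whitespace-separated text of the first record into numbers
fiat <- as.numeric(strsplit(trimws(values[1]), "\\s+")[[1]])
```

getNodeSet is used here instead of xpathSApply so the matched nodes can be reused for several extractions without repeating the query.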