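I have an XML file that I want to parse with R. I run the code below to parse it into a data frame and get the output shown. I cannot get the attribute datetime="2016-12-15T22:45:40.000Z" into the data frame, although I do get the cumulative operating hours value (1059.64). I want to parse the datetime attribute from the XML document into the data frame. Any ideas on how to do this?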
library(XML)
xmldataframe <- xmlToDataFrame("xamal.xml")
xmlfile <- xmlParse("xamal.xml")
rootnode <- xmlRoot(xmlfile)
rootsize <- xmlSize(rootnode)
print(rootsize)
[1] 103
print(rootnode[[11]][[5]])
<CumulativeOperatingHours datetime="2016-12-15T22:45:40.000Z">
<Hour>1059.60</Hour>
</CumulativeOperatingHours>
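Below is the XML file I am trying to read into R. It is a long file, so I need to read it into R from disk and build a data frame that includes the datetime attribute.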
<?xml version="1.0" encoding="UTF-8"?>
<Group xmlns="http://standards.is.com/is/151/-1" version="2" Time="2018-05-30T19:33:44.352Z">
<Links>
<rel>self</rel>
<href>https://cloud.com/1</href>
</Links>
<Links>
<rel>last</rel>
<href>https://cloud.com/2</href>
</Links>
<Links>
<rel>next</rel>
<href>https://cloud.com/3</href>
</Links>
<Equip>
<EquipHead>
<Name>CAST</Name>
<Model>1100</Model>
<EquipmentID>Desk</EquipmentID>
<SerialNumber>12312312</SerialNumber>
<PIN>123123</PIN>
</EquipHead>
<Location datetime="2012-06-25T11:14:54.000Z">
<Latitude>44.57</Latitude>
<Longitude>-95.51</Longitude>
</Location>
<OperatingHours datetime="2012-03-01T17:42:37.000Z">
<Hour>198.80</Hour>
</OperatingHours>
</Equip>
<Equip>
<EquipHead>
<Name>Yuza</Name>
<Model>L208</Model>
<EquipmentID>4DW772GP</EquipmentID>
<SerialNumber>4DW772GP</SerialNumber>
<PIN>1DW772GPVJF</PIN>
</EquipHead>
<Location datetime="2018-05-30T19:22:46.000Z">
<Latitude>47.518556</Latitude>
<Longitude>-70.422444</Longitude>
</Location>
<IdleHours datetime="2018-05-30T19:02:46.000Z">
<Hour>33.74</Hour>
</IdleHours>
<OperatingHours datetime="2018-05-30T19:22:48.000Z">
<Hour>72.35</Hour>
</OperatingHours>
<Distance datetime="2018-05-30T19:02:46.000Z">
<Odometer>kilometre</Odometer>
<OdometerV>30.9</OdometerV>
</Distance>
<FuelUsed datetime="2018-05-30T19:02:46.000Z">
<FuelUnits>litre</FuelUnits>
<Consumed>395</Consumed>
</FuelUsed>
</Equip>
<Equip>
<EquipHead>
<OEMName>CALL</OEMName>
<Model>562A</Model>
<EquipmentID>1W2772G</EquipmentID>
<SerialNumber>1TT772GPTE</SerialNumber>
<PIN>1MM772GPTE</PIN>
</EquipHead>
<Location datetime="2018-05-30T07:00:17.000Z">
<Latitude>22.809278</Latitude>
<Longitude>-45.316417</Longitude>
</Location>
<IdleHours datetime="2018-05-24T20:37:03.000Z">
<Hour>457.10</Hour>
</IdleHours>
<OperatingHours datetime="2018-05-30T18:25:18.000Z">
<Hour>26.35</Hour>
</OperatingHours>
<Distance datetime="2018-05-23T13:26:37.000Z">
<Units>kilometre</Units>
<OdometerV>5075.6997</OdometerV>
</Distance>
<FuelUsed datetime="2018-05-24T20:37:03.000Z">
<FuelUnits>litre</FuelUnits>
<FuelConsumed>2548</FuelConsumed>
</FuelUsed>
</Equip>
</Group>
Answer 0 (score: 1)
Consider the undocumented internal function XML:::xmlAttrsToDataFrame, combined via cbind with the documented xmlToDataFrame:
library(XML)
doc <- xmlParse('/path/to/input.xml')
namespaces <- c(n="http://standards.is.com/is/151/-1")
# Select the nodes once and reuse them for both the values and the attributes
nodes <- getNodeSet(doc, "//n:OperatingHours", namespaces)
xmldataframe <- cbind(xmlToDataFrame(doc, nodes=nodes),
                      XML:::xmlAttrsToDataFrame(nodes))
xmldataframe
# Hour datetime
# 1 198.80 2012-03-01T17:42:37.000Z
# 2 72.35 2018-05-30T19:22:48.000Z
# 3 26.35 2018-05-30T18:25:18.000Z
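If relying on an unexported function is a concern, the same result can probably be reached with documented calls only. The following is a minimal sketch, not the answer's original method: it reads the attribute with xmlGetAttr and the child value with xmlValue, then converts the timestamp to POSIXct. The inline xml_text is an assumed, trimmed-down copy of the file in the question, and the names xml_text, ns and hours are illustrative.

```r
library(XML)

# Assumed sample: a trimmed-down version of the XML in the question
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<Group xmlns="http://standards.is.com/is/151/-1" version="2">
  <Equip>
    <OperatingHours datetime="2012-03-01T17:42:37.000Z"><Hour>198.80</Hour></OperatingHours>
  </Equip>
  <Equip>
    <OperatingHours datetime="2018-05-30T19:22:48.000Z"><Hour>72.35</Hour></OperatingHours>
  </Equip>
</Group>'

# For a file on disk, use xmlParse("/path/to/input.xml") instead
doc <- xmlParse(xml_text, asText = TRUE)

# The document declares a default namespace, so XPath needs a prefix for it
ns <- c(n = "http://standards.is.com/is/151/-1")
nodes <- getNodeSet(doc, "//n:OperatingHours", namespaces = ns)

# Build one row per node: child text plus the datetime attribute
hours <- data.frame(
  Hour     = as.numeric(sapply(nodes, function(nd) xmlValue(nd[["Hour"]]))),
  datetime = sapply(nodes, xmlGetAttr, "datetime"),
  stringsAsFactors = FALSE
)

# Convert the ISO 8601 string to a proper date-time (UTC, fractional seconds)
hours$datetime <- as.POSIXct(hours$datetime,
                             format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")
hours
```

Reading both fields from the same node keeps each Hour aligned with its own datetime, which two separate XPath queries would not guarantee if some elements were missing a child.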