How to read XML files that are missing the initial tag in R

Date: 2019-05-10 12:24:14

Tags: r xml readxml parsexml

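I have several XML files that are all missing the initial tags (the XML declaration and the enclosing root element). For example, this is a correctly formatted file: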

<?xml version="1.0"?>
<UDI>
<Test_Equipment_Number>3300061-01</Test_Equipment_Number>
<Test_SW_Number>3300062</Test_SW_Number>
<Test_SW_Version>2.1</Test_SW_Version>
<GTIN>(01)00884838088597</GTIN>
<LOT></LOT>
<Date_of_Mfg>(11)20190322</Date_of_Mfg>
<Device_SN>(21)1160001242</Device_SN>
<Material_Number>(96)300001287651</Material_Number>
<PCBA_WO_and_SN>00190311-0001242</PCBA_WO_and_SN>
<FW_Version>06</FW_Version>
<Model>324PHB</Model>
</UDI>

This is a file that is missing the initial tags:

<Test_Equipment_Number>3300011-01</Test_Equipment_Number>
<Test_SW_Number>3300012</Test_SW_Number>
<Test_SW_Version>5.1</Test_SW_Version>
<GTIN>(01)00884838085497</GTIN>
<LOT></LOT>
<Date_of_Mfg>(11)20190411</Date_of_Mfg>
<Device_SN>(21)1120104548</Device_SN>
<Material_Number>(96)300000267981</Material_Number>
<PCBA_WO_and_SN>000143-00000793</PCBA_WO_and_SN>
<FW_Version>V01.0001</FW_Version>
<Model>7000PHW</Model>

How can I read the files that are missing the initial tags in R?

1 Answer:

Answer 0: (score: 1)

One option is to parse the XML fragment and specify a top-level node to wrap it in:

# install.packages('XML')
library(XML)

fragment <- 
'<Test_Equipment_Number>3300011-01</Test_Equipment_Number>
<Test_SW_Number>3300012</Test_SW_Number>
<Test_SW_Version>5.1</Test_SW_Version>
<GTIN>(01)00884838085497</GTIN>
<LOT></LOT>
<Date_of_Mfg>(11)20190411</Date_of_Mfg>
<Device_SN>(21)1120104548</Device_SN>
<Material_Number>(96)300000267981</Material_Number>
<PCBA_WO_and_SN>000143-00000793</PCBA_WO_and_SN>
<FW_Version>V01.0001</FW_Version>
<Model>7000PHW</Model>'

XML::parseXMLAndAdd(fragment, top = 'content')
#> <content>
#>   <Test_Equipment_Number>3300011-01</Test_Equipment_Number>
#>   <Test_SW_Number>3300012</Test_SW_Number>
#>   <Test_SW_Version>5.1</Test_SW_Version>
#>   <GTIN>(01)00884838085497</GTIN>
#>   <LOT/>
#>   <Date_of_Mfg>(11)20190411</Date_of_Mfg>
#>   <Device_SN>(21)1120104548</Device_SN>
#>   <Material_Number>(96)300000267981</Material_Number>
#>   <PCBA_WO_and_SN>000143-00000793</PCBA_WO_and_SN>
#>   <FW_Version>V01.0001</FW_Version>
#>   <Model>7000PHW</Model>
#> </content>
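
The example above parses a string literal, while the question is about files on disk. A minimal sketch of one way to handle that, assuming each file holds only the child elements: read the file, wrap its text in a root element by hand, and parse the result with xmlParse(). The helper name read_xml_fragment and the default root name "UDI" (borrowed from the well-formed file) are assumptions made for illustration, not part of the original answer.

```r
library(XML)

# Hypothetical helper: read a file that lacks a root element, wrap its
# contents in <root>...</root>, and parse the result as an XML document.
read_xml_fragment <- function(path, root = "UDI") {
  lines <- readLines(path, warn = FALSE)
  wrapped <- paste0("<", root, ">", paste(lines, collapse = "\n"), "</", root, ">")
  xmlParse(wrapped, asText = TRUE)
}

# Demo: write a small root-less fragment to a temporary file, then read it back.
tmp <- tempfile(fileext = ".xml")
writeLines(c(
  "<Test_SW_Version>5.1</Test_SW_Version>",
  "<FW_Version>V01.0001</FW_Version>",
  "<Model>7000PHW</Model>"
), tmp)

doc <- read_xml_fragment(tmp)

# Collect the child elements of the root as a named character vector.
fields <- setNames(
  xpathSApply(doc, "/*/*", xmlValue),
  xpathSApply(doc, "/*/*", xmlName)
)
print(fields)
```

To process several files, the same helper could be applied over a vector of paths with lapply(). Files that already contain a root element and an XML declaration should be parsed directly with xmlParse(path) instead, because wrapping them again would put the declaration inside the new root and make the document invalid.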