Pig: parsing XML with XPath when tags occur multiple times

Time: 2018-11-01 16:06:10

Tags: xml xpath apache-pig

I need help using XPath and XPathAll to read nested XML tags that occur multiple times. Example:

<ProductInfo>
 <code>100</code>
 <entryInfo> 
    <statusCode>10</statusCode>
    <startTime>11</startTime>
    <endTime>12</endTime>
 </entryInfo> 
 <entryInfo>
    <statusCode>20</statusCode>
    <startTime>21</startTime>
    <endTime>22</endTime>
    <strengthValue>23</strengthValue>
    <strengthUnits>24</strengthUnits>
 </entryInfo>
 <entryInfo>
    <statusCode>30</statusCode>        
    <endTime>32</endTime>
    <strengthValue>33</strengthValue>
    <strengthUnits>34</strengthUnits>
    <extra>35</extra>
 </entryInfo>  
</ProductInfo>

Expected output: 100,10,11,12,,,,20,21,22,23,24,,30,,32,33,34,35

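When reading the first entryInfo occurrence and its child tags, taking $0 from each XPathAll result fills in the tags missing from that occurrence with values from a later occurrence: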

XPathAll(x,'ProductInfo/entryInfo/statusCode').$0,
XPathAll(x,'ProductInfo/entryInfo/startTime').$0,
XPathAll(x,'ProductInfo/entryInfo/endTime').$0,
XPathAll(x,'ProductInfo/entryInfo/strengthValue').$0,
XPathAll(x,'ProductInfo/entryInfo/strengthUnits').$0,
XPathAll(x,'ProductInfo/entryInfo/extra').$0

Actual output: 10,11,12,23,24,35
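
The mismatch seems to come from each path being matched across all entryInfo elements at once, so the first match for a tag can belong to a later entry. As a rough illustration outside Pig, the Python sketch below contrasts that behaviour with a per-entry lookup that pads missing tags with empty fields. It uses only the standard library; the names FIELDS, SAMPLE, flatten_product and first_match_per_path are hypothetical helpers made up for this example, and the sketch is not a Pig solution in itself.

```python
import xml.etree.ElementTree as ET

# Child tags read from every <entryInfo>, in output order (taken from the question).
FIELDS = ["statusCode", "startTime", "endTime",
          "strengthValue", "strengthUnits", "extra"]

# Sample document from the question.
SAMPLE = """<ProductInfo>
 <code>100</code>
 <entryInfo>
    <statusCode>10</statusCode>
    <startTime>11</startTime>
    <endTime>12</endTime>
 </entryInfo>
 <entryInfo>
    <statusCode>20</statusCode>
    <startTime>21</startTime>
    <endTime>22</endTime>
    <strengthValue>23</strengthValue>
    <strengthUnits>24</strengthUnits>
 </entryInfo>
 <entryInfo>
    <statusCode>30</statusCode>
    <endTime>32</endTime>
    <strengthValue>33</strengthValue>
    <strengthUnits>34</strengthUnits>
    <extra>35</extra>
 </entryInfo>
</ProductInfo>"""


def first_match_per_path(xml_text):
    """Mimic taking the first match of each flat path across all entries."""
    root = ET.fromstring(xml_text)
    out = []
    for name in FIELDS:
        matches = root.findall("entryInfo/" + name)
        out.append(matches[0].text if matches else "")
    return ",".join(out)


def flatten_product(xml_text):
    """Look up each tag relative to its own <entryInfo>, padding gaps with ''."""
    root = ET.fromstring(xml_text)
    row = [root.findtext("code", default="")]
    for entry in root.findall("entryInfo"):
        for name in FIELDS:
            # findtext returns the default when the child tag is absent.
            row.append(entry.findtext(name, default=""))
    return ",".join(row)


if __name__ == "__main__":
    print(first_match_per_path(SAMPLE))
    print(flatten_product(SAMPLE))
    # second line: 100,10,11,12,,,,20,21,22,23,24,,30,,32,33,34,35
```

The point of the sketch is the order of iteration: entries first, then tags within each entry, so a missing tag becomes an empty field rather than being filled from another entry. How best to express the same per-entry lookup in Pig itself (for example through a UDF) is left open here.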

0 answers:

No answers