XMLStarlet: print one line per item while also using an attribute from the parent element

Time: 2017-02-15 22:38:24

Tags: xml bash parsing xpath xmlstarlet

I have XML data formatted like this:

<XML>
    <Waveforms Time="01/01/2009 3:00:02 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
    <Waveforms Time="01/01/2009 3:00:04 AM">
        <WaveformData Channel="I">1, 2, 3, 4, 5, 6 </WaveformData>
        <WaveformData Channel="II">9, 8, 7, 6, 5, 4 </WaveformData>
    </Waveforms>
</XML>

I'm trying to use xmlstarlet to parse this data into a comma-separated text file. The desired output looks like this:

Time Attribute, Channel Attribute, Data
01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM, II, 9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM, II, 9, 8, 7, 6, 5, 4

The best I've come up with is:

 xmlstarlet sel -T -t -m //XML/Waveforms -v @Time -o "," -m Waves -v WaveformData/@Channel -o "," -v WaveformData -o "," -b -n testwave2.xml > testwave.txt

The result looks like this:

 01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6, II, 9, 8, 7, 6, 5, 4
 01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6, II, 9, 8, 7, 6, 5, 4

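How can I print one line for each WaveformData, rather than one line for each Waveforms, when I also want to include the Time attribute from its parent element? Can this be done? Or should I post-process the output afterward with some cutting and pasting to fix it up?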

2 Answers:

Answer 0: (score: 4)

Match on WaveformData, since that's what you want one line for, and just traverse up the tree to find the Time attribute on its parent.

$ xmlstarlet sel -T -t -m /XML/Waveforms/WaveformData \
     -v ../@Time -o "," \
     -v @Channel -o "," \
     -v . -n <in.xml
01/01/2009 3:00:02 AM,I,1, 2, 3, 4, 5, 6 
01/01/2009 3:00:02 AM,II,9, 8, 7, 6, 5, 4 
01/01/2009 3:00:04 AM,I,1, 2, 3, 4, 5, 6 
01/01/2009 3:00:04 AM,II,9, 8, 7, 6, 5, 4 

Alternatively, if you know that every Waveforms will have exactly two WaveformData children (channels I and II), you can do the following:

$ xmlstarlet sel -T -t -m /XML/Waveforms \
    -v ./@Time -o ",I,"  -v './WaveformData[@Channel="I"]' -n \
    -v ./@Time -o ",II," -v './WaveformData[@Channel="II"]' -n <in.xml
01/01/2009 3:00:02 AM,I,1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM,II,9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM,I,1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM,II,9, 8, 7, 6, 5, 4

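Answer 1: (score: 2)

As a slight variation on Charles Duffy's answer, you can use the concat() function to simplify it, and use an initial template to provide the CSV header: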

$ xmlstarlet sel \
    -t -o 'Time Attribute, Channel Attribute, Data' -n \
    -t -m '//Waveforms/WaveformData' \
       -v 'concat(../@Time, ", ", @Channel, ", ", text())' -n \
  waveforms.xml
Time Attribute, Channel Attribute, Data
01/01/2009 3:00:02 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:02 AM, II, 9, 8, 7, 6, 5, 4
01/01/2009 3:00:04 AM, I, 1, 2, 3, 4, 5, 6
01/01/2009 3:00:04 AM, II, 9, 8, 7, 6, 5, 4