Using grep to aggregate data from multiple XML files into one file

Time: 2018-04-16 10:53:22

Tags: xml xslt xpath awk grep

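I have multiple XML files from which I want to extract values and write them, one line per file, to a separate csv / text / excel file.

I tried the following grep command: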

grep -e \<r p\>  Inputfilename | sed 's/<[^>]*>//g' | awk '{ print $2 }' | awk '{ for (i=1;i<=NF;i++ ) printf $i " " }' >> Output.txt 

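But this command writes all the values on a single line. I'm new to this, so I'm not sure how to get each file's values onto a line of their own.

Here is a sample input file: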

<measType p="1">Used NonHeap Mem MB</measType>
            <measType p="2">Online CPU Usage %</measType>
            <measType p="3">Used Physical Mem %</measType>
            <measType p="4">Used Physical Mem MB</measType>
            <measType p="5">Used Heap Mem %</measType>
            <measType p="6">Used Tenured Gen MB</measType>
            <measType p="7">Used Survivor Space MB</measType>
            <measType p="8">Used NonHeap Mem %</measType>
            <measType p="9">Total CPU Usage %</measType>
            <measType p="10">Used Eden Space MB</measType>
            <measType p="11">Used Heap Mem MB</measType>
            <measValue measObjLdn="">
                <r p="1">48.361183166503906</r>
                <r p="2">0.008397036232054234</r>
                <r p="3">4.5677</r>
                <r p="4">34425.0</r>
                <r p="5">68.05066879841843</r>
                <r p="6">410.58392333984375</r>
                <r p="7">22.375</r>
                <r p="8">93.67783664213832</r>
                <r p="9">0.028054807427357</r>
                <r p="10">169.9580841064453</r>
                <r p="11">602.8837356567383</r>
            </measValue>

For this input, the output I get from the command above is:

48.361183166503906 0.008397036232054234 4.5677 34425.0 68.05066879841843 410.58392333984375 22.375 93.67783664213832 0.028054807427357 169.9580841064453 602.883735656738

When I run this command on multiple files, it produces something like this:

48.361183166503906 0.008397036232054234 4.5677 34425.0 68.05066879841843 410.58392333984375 22.375 93.67783664213832 0.028054807427357 169.9580841064453 602.8837356567383  48.377540588378906 0.008116667158901691 5.73992 33834.0 10.798112742450364 42.10478973388672 22.375 93.70952172083081 0.021666161122907 31.18431854248047 95.66410827636719  58.068382263183594 3.406280755996704 6.46515 34405.0 56.60833858273274 903.4959945678711 16.5166015625 94.90236120642875 7.068469741716277 39.66230773925781 959.4206771850586

But I want the command's result to be:

48.361183166503906 0.008397036232054234 4.5677 34425.0 68.05066879841843 410.58392333984375 22.375 93.67783664213832 0.028054807427357 169.9580841064453 602.8837356567383  
48.377540588378906 0.008116667158901691 5.73992 33834.0 10.798112742450364 42.10478973388672 22.375 93.70952172083081 0.021666161122907 31.18431854248047 95.66410827636719  
58.068382263183594 3.406280755996704 6.46515 34405.0 56.60833858273274 903.4959945678711 16.5166015625 94.90236120642875 7.068469741716277 39.66230773925781 959.4206771850586  

Please help me. Thanks in advance!

2 answers:

Answer 0 (score: 1)

One option is to use xmlstarlet's tr command with an XSLT stylesheet.

Example...

XSLT 1.0 (example.xsl)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/*">
    <xsl:for-each select=".//r">
      <xsl:sort select="@p" data-type="number"/>
      <xsl:if test="position() > 1">
        <xsl:text> </xsl:text>
      </xsl:if>
      <xsl:value-of select="normalize-space()"/>
    </xsl:for-each>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

</xsl:stylesheet>

xmlstarlet command line

xml tr example.xsl *.xml

Output (using two input files: the one you supplied and a copy with "b" appended to each r value)

48.361183166503906 0.008397036232054234 4.5677 34425.0 68.05066879841843 410.58392333984375 22.375 93.67783664213832 0.028054807427357 169.9580841064453 602.8837356567383
48.361183166503906b 0.008397036232054234b 4.5677b 34425.0b 68.05066879841843b 410.58392333984375b 22.375b 93.67783664213832b 0.028054807427357b 169.9580841064453b 602.8837356567383b

You can also get something very similar with xmlstarlet's sel command (at the moment I get an extra newline at the beginning of the output).

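Answer 1 (score: 0)

Everything probably ends up on one line because of the printf in your awk command. By default, printf does not append a newline. Try using print instead, or add "\n" explicitly.

Alternatively, if your measValue tag always contains 11 nodes, consider using: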
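To illustrate the printf point, here is a minimal sketch that processes one file at a time and emits a single newline once that file's values have been printed. The function name values_per_line is hypothetical, and the sketch assumes each r element sits on its own line as in the sample input. It prints $1 rather than $2 because, once the tags are stripped from the sample lines, the value is the first field.

```shell
# Print the <r> values of each file on its own line, separated by spaces.
# values_per_line is a hypothetical helper name, not part of any tool.
values_per_line() {
    for f in "$@"; do
        grep '<r p=' "$f" \
            | sed 's/<[^>]*>//g' \
            | awk '{ printf "%s%s", (NR > 1 ? " " : ""), $1 }  # no newline here
                   END { if (NR) print "" }'                   # one newline per file
    done
}
```

It could then be called as values_per_line *.xml >> Output.txt.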

$ grep -e \<r p\>  Inputfilename | sed 's/<[^>]*>//g' | awk '{print $2}' | paste - - - - - - - - - - -
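Note that paste separates the joined values with tabs unless -d is given. If the number of r nodes is not always 11, one possible variant is to join all the values of each file with paste -s instead of a fixed number of dashes. This is only a sketch: the function name join_values is hypothetical, and it assumes one r element per line, as in the sample input.

```shell
# Join every <r> value of a file into one space-separated line, one line per file.
# join_values is a hypothetical helper name.
join_values() {
    for f in "$@"; do
        # Keep only lines holding an <r p="..."> element and print its text content.
        sed -n 's/.*<r p="[^"]*">\([^<]*\)<\/r>.*/\1/p' "$f" \
            | paste -s -d ' ' -   # -s joins all input lines; -d ' ' uses spaces, not tabs
    done
}
```

For example, join_values *.xml >> Output.txt appends one line per input file.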