Bash script: nested loop over a CSV input file to create XML elements with a varying number of attributes

Asked: 2016-12-27 14:51:04

Tags: xml bash

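I am new to bash scripting and am trying to generate an XML document from a CSV data file. The sample data file I am working with contains 3 (potentially hundreds of) unique elements (the expression column), each with 5 attributes. I have a nested loop, but I cannot get the second loop to finish all 5 attributes before control returns to the first loop.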

My sample data file is:

source,attribute,type,expression,confidence
profile_sysObjectID,FastIron ,Model,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1,92
profile_sysObjectID,Switch/Router,DeviceType,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1,90
profile_sysObjectID,Brocade,Vendor,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1,90
profile_sysObjectID,Embedded (Brocade),OS,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1,90
profile_sysObjectID,Embedded (Brocade),Version,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1,90
profile_sysObjectID,FastIron WG Switch,Model,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1,92
profile_sysObjectID,Switch,DeviceType,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1,92
profile_sysObjectID,Brocade,Vendor,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1,92
profile_sysObjectID,Embedded (Brocade),OS,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1,92
profile_sysObjectID,Embedded (Brocade),Version,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1,92
profile_sysObjectID,FastIron BB Switch,Model,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.2,92
profile_sysObjectID,Switch,DeviceType,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.2,92
profile_sysObjectID,Brocade,Vendor,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.2,92
profile_sysObjectID,Embedded (Brocade),OS,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.2,92
profile_sysObjectID,Embedded (Brocade),Version,^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.2,92

My current script looks like this:

#!/bin/bash
echo '   input filename...'
read file_in
file_out="sampleout.xml"
echo '<patterns xmlns="urn:lumeta:pattern:6.0" version="2.1.666" user-provided="true">' > $file_out
while IFS=$',' read -r -a arry
do
  echo '  <pattern>' >> $file_out
  echo '    <source>'${arry[0]}'</source>' >> $file_out
  echo '    <expression>'${arry[3]}'</expression>' >> $file_out
  echo '    <attributes>' >> $file_out
        exp=${arry[3]}
        for exp in ${arry[3]}
        do
  echo '        <attribute> type="'${arry[2]}'" confidence="'${arry[4]}'">'${arry[1]}'</attribute>' >> $file_out
        done
  echo '  </pattern>' >> $file_out
done < $file_in
echo  '</patterns>' >> $file_out
exit 0

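The XML this script currently produces contains only the first attribute of each parent element before control returns to the first loop and a new element is created.
The current output is: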

<patterns xmlns="urn:lumeta:pattern:6.0" version="2.1.666" user-provided="true">
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1</expression>
    <attributes>
        <attribute> type="Model" confidence="92">FastIron </attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1</expression>
    <attributes>
        <attribute> type="DeviceType" confidence="90">Switch/Router</attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1</expression>
    <attributes>
        <attribute> type="Vendor" confidence="90">Brocade</attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1</expression>
    <attributes>
        <attribute> type="OS" confidence="90">Embedded (Brocade)</attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1</expression>
    <attributes>
        <attribute> type="Version" confidence="90">Embedded (Brocade)</attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1</expression>
    <attributes>
        <attribute> type="Model" confidence="92">FastIron WG Switch</attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1</expression>
    <attributes>
        <attribute> type="DeviceType" confidence="92">Switch</attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1</expression>
    <attributes>
        <attribute> type="Vendor" confidence="92">Brocade</attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1</expression>
    <attributes>
        <attribute> type="OS" confidence="92">Embedded (Brocade)</attribute>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1</expression>
    <attributes>
        <attribute> type="Version" confidence="92">Embedded (Brocade)</attribute>
  </pattern>

The desired output for the first elements above is:

<patterns xmlns="urn:lumeta:pattern:6.0" version="2.1.666" user-provided="true">
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1</expression>
    <attributes>
        <attribute type="Model" confidence="92">FastIron </attribute>
        <attribute type="DeviceType" confidence="90">Switch/Router</attribute>
        <attribute type="Vendor" confidence="90">Brocade</attribute>
        <attribute type="OS" confidence="90">Embedded (Brocade)</attribute>
        <attribute type="Version" confidence="90">Embedded (Brocade)</attribute>
    </attributes>
  </pattern>
  <pattern>
    <source>profile_sysObjectID</source>
    <expression>^.?1\.3\.6\.1\.4\.1\.1991\.1\.3\.1\.1</expression>
    <attributes>
        <attribute type="Model" confidence="92">FastIron WG Switch</attribute>
        <attribute type="DeviceType" confidence="92">Switch</attribute>
        <attribute type="Vendor" confidence="92">Brocade</attribute>
        <attribute type="OS" confidence="92">Embedded (Brocade)</attribute>
        <attribute type="Version" confidence="92">Embedded (Brocade)</attribute>
    </attributes>
  </pattern>
</patterns>

1 Answer:

Answer 0 (score: 1)

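I edited your desired output to make it well-formed XML. I did your homework using a lot of bash-isms. Note that I would not use this in production without adding some error-checking code. Comments are sparse, but with a little RTFD everything should be clear. The script reads the CSV from standard input and writes the XML to standard output.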

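#!/bin/bash
# Reads the CSV on standard input and writes the XML to standard output.

# Read one CSV row into the global variables named by the header line.
# Returns non-zero at end of input; callers use that status to stop.
function read_line () {
    declare -g $vars
    IFS=, read -r $vars
}

# Print an opening tag (optional attributes in $2) and indent one level deeper.
function tag_open () {
    declare -g indent
    echo "${indent}<${1}${2:+ }$2>"
    indent+='  '
}

# Go back one indentation level and print the closing tag.
function tag_close () {
    indent="${indent%  }"
    echo "${indent}</$1>"
}

# Print <name>value</name>, where the value is held in the variable named $1.
function print_tagged_var () {
    echo "$indent<${1}${2:+ }$2>${!1}</$1>"
}

# Print the current row as an <attribute> element.
function print_attribute () {
    local props
    printf -v props 'type="%s" confidence="%u"' "$type" "$confidence"
    print_tagged_var attribute "$props"
}

# Emit one <pattern> built from five consecutive rows.
function consume_pattern () {
    read_line || return 1
    tag_open pattern
        print_tagged_var source
        print_tagged_var expression
        tag_open attributes
            print_attribute
            for i in {1..4}; do
                read_line || break
                print_attribute
            done
        tag_close attributes
    tag_close pattern
}


##### MAIN LOOP #####

# The header line supplies the variable names used for every row.
read -r vars
vars="${vars//,/ }"
tag_open patterns 'xmlns="urn:lumeta:pattern:6.0" version="2.1.666" user-provided="true"'
    while consume_pattern; do
        :
    done
tag_close patterns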
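
For what it is worth, the inner loop in the question (for exp in ${arry[3]}) iterates over the words of a single field of the current row, so it can only run once per row, and the script above relies on every pattern having exactly five rows. If the number of attributes per expression varies, one possible approach is to open a new pattern whenever the expression column changes. The following is only a minimal sketch under some assumptions: rows that share an expression are adjacent (as in the sample), fields contain no embedded commas or XML special characters, and the function name csv_to_patterns is made up for illustration.

```shell
#!/bin/bash
# Sketch: group consecutive CSV rows with the same expression into one
# <pattern>. Reads the CSV on stdin, writes XML to stdout.
# csv_to_patterns is a hypothetical name, not part of the original scripts.

csv_to_patterns () {
    local header source attribute type expression confidence
    local prev='' open=0

    # Skip the header line (source,attribute,type,expression,confidence).
    IFS= read -r header

    printf '%s\n' '<patterns xmlns="urn:lumeta:pattern:6.0" version="2.1.666" user-provided="true">'

    # -r keeps the backslashes of the expression column intact; the extra
    # test lets a final line without a trailing newline be processed too.
    while IFS=, read -r source attribute type expression confidence ||
          [ -n "$source" ]; do
        # Ignore blank lines between rows.
        [ -z "$source" ] && continue

        if [ "$open" -eq 0 ] || [ "$expression" != "$prev" ]; then
            # The expression changed: close the previous pattern, if any,
            # and open a new one.
            if [ "$open" -eq 1 ]; then
                printf '%s\n' '    </attributes>' '  </pattern>'
            fi
            printf '%s\n' '  <pattern>'
            printf '    <source>%s</source>\n' "$source"
            printf '    <expression>%s</expression>\n' "$expression"
            printf '%s\n' '    <attributes>'
            prev=$expression
            open=1
        fi

        # One <attribute> per row, however many rows the group has.
        printf '        <attribute type="%s" confidence="%s">%s</attribute>\n' \
            "$type" "$confidence" "$attribute"
    done

    # Close the last pattern, if at least one row was read.
    if [ "$open" -eq 1 ]; then
        printf '%s\n' '    </attributes>' '  </pattern>'
    fi
    printf '%s\n' '</patterns>'
}
```

After sourcing the function it could be run as csv_to_patterns < input.csv > sampleout.xml, where input.csv stands for whatever the data file is called. Comparing against the previous expression avoids hard-coding the group size, at the cost of requiring the file to be sorted (or at least grouped) by expression.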