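Associate a node's attribute value with its child nodes

Date: 2015-01-05 20:05:33

Tags: xml macos xpath

I want to associate a node's attribute value with its child nodes, across multiple XML files.

For example, my XML files contain this structure: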

    <desc id="butwba10.1.wc.01" dbi="BUTWBA10.1.1.WC">
        <objtitle>
            <title type="transcribed">Title-Page Design for <hi rend="i">The Grave</hi>
            </title>
            <title type="alt">The Skeleton Re-Animated</title>
            <title type="alt">A Characteristic Frontispiece</title>, <objid>
                <objnumber code="A1">object 1 </objnumber>
            </objid>
        </objtitle>
        <physdesc desclevel="brief">
            <objsize>33.2 x 26.6 cm.</objsize>
            <objnote>
                    adfadfa
            </objnote>
            <windowsize width="600" height="700"/>
        </physdesc>
        <related objectid="bb435.1.comdes.02"/>
        <related objectid="but614r.1.penc.01"/>
        <related objectid="but611.1.wc.01"/>
        <related objectid="but612.1.wd.01"/>
        <related objectid="bb515.1.comb.12"/>
    </desc>

I want to extract the desc id (butwba10.1.wc.01), associate it with the "related" objectids (bb435.1.comdes.02, but614r.1.penc.01, ...), and write the association to a file like this:

butwba10.1.wc.01 related="bb435.1.comdes.02, but614r.1.penc.01, ..."

I'm using bash. I can do something like

xpath BUTWBA10.1.xml //bad/objdesc/desc//related > test

but then I don't get the desc id that belongs to each group of related elements.

------ Update ------

@sputnick, I have this in a bash file:

#!/bin/bash

for f in *.xml
do

id=$(xml sel -t -v '//bad/objdesc/desc/@id' $f)
arr=( $(xml sel -t -v '//bad/objdesc/desc/related/@objectid' $f) )
cat<<EOF >> output.txt
$id related="$(printf '%s\n' "${arr[@]}" | paste -sd ', ')"
EOF

done

But the output shows:

butwba10.1.wc.01
butwba10.1.wc.02
butwba10.1.wc.03
butwba10.1.wc.04
butwba10.1.wc.05
butwba10.1.wc.06
butwba10.1.wc.07
butwba10.1.wc.08
butwba10.1.wc.09
butwba10.1.wc.10
butwba10.1.wc.11
butwba10.1.wc.12
butwba10.1.wc.13
butwba10.1.wc.14
butwba10.1.wc.15
butwba10.1.wc.16
butwba10.1.wc.17
butwba10.1.wc.18
butwba10.1.wc.19
butwba10.1.wc.20 related=""

------ Update ------

@sputnick, I get this output. All the "related" elements are lumped together with the last desc id of the XML file:

bb421.1.spb.01
bb421.1.spb.02
bb421.1.spb.03
bb421.1.spb.04
bb421.1.spb.05
bb421.1.spb.06
bb421.1.spb.07
bb421.1.spb.08
bb421.1.spb.09
bb421.1.spb.10
bb421.1.spb.11
bb421.1.spb.12
bb421.1.spb.13
bb421.1.spb.14
bb421.1.spb.15
bb421.1.spb.16
bb421.1.spb.17
bb421.1.spb.18
bb421.1.spb.19
bb421.1.spb.20
bb421.1.spb.21
bb421.1.spb.22
bb421.1.spb.23 related="but550.1.wc.01,but551.1.wc.01,but557.1.penc.05,but557.1.penc.06,but557.1.penc.07,but557.1.penc.30,but557.1.penc.31,but550.1.wc.02,but551.1.wc.02,but557.1.penc.08,but550.1.wc.03,but551.1.wc.03,but557.1.penc.09,but550.1.wc.04,but551.1.wc.04,but557.1.penc.10,but550.1.wc.05,but551.1.wc.05,but557.1.penc.11,but550.1.wc.06,but551.1.wc.06,but557.1.penc.04,but557.1.penc.12,but550.1.wc.07,but551.1.wc.07,but557.1.penc.13,but550.1.wc.08,but551.1.wc.08,but557.1.penc.04,but557.1.penc.14,but550.1.wc.09,but551.1.wc.09,but557.1.penc.15,but550.1.wc.10,but551.1.wc.10,but557.1.penc.16,but550.1.wc.11,but551.1.wc.11,but557.1.penc.17,but550.1.wc.12,but551.1.wc.12,but557.1.penc.18,but461.1.wc.01,but550.1.wc.13,but551.1.wc.13,but557.1.penc.19,but550.1.wc.14,but551.1.wc.14,but557.1.penc.20,but550.1.wc.15,but551.1.wc.15,but557.1.penc.21,but550.1.wc.16,but551.1.wc.16,but557.1.penc.22,but550.1.wc.17,but551.1.wc.17,but557.1.penc.23,but550.1.wc.18,but551.1.wc.18,but557.1.penc.02,but557.1.penc.24,but550.1.wc.19,but551.1.wc.19,but557.1.penc.26,but557.1.penc.28,but550.1.wc.20,but551.1.wc.20,but557.1.penc.25,but557.1.penc.29,but394.1.pt.01,but551.1.wc.21,but550.1.wc.21,but557.1.penc.27,but557.1.penc.30,but557.1.penc.31"

1 answer:

Answer 0 (score: 0)

A complete implementation for the sample output:

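#!/bin/bash

# Loop over the XML files given as command-line arguments.
for x; do
    # These XPaths assume <desc> is the root element of each file.
    id=$(xmlstarlet sel -t -v '/desc[1]/@id' "$x")
    # Collect the objectid of every <related> child into an array.
    arr=( $(xmlstarlet sel -t -v '/desc[1]/related/@objectid' "$x") )
    # Append one line per file: the id plus the comma-joined objectids.
    cat<<EOF >> new_file
$id related="$(perl -e 'print join ",", @ARGV' "${arr[@]}")"
EOF
done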
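
Note that the Perl one-liner in this script joins with a bare comma, while the format requested in the question puts a space after each comma. As a minimal sketch, the join could also be done in plain bash. `join_with` is a hypothetical helper name, not part of the original script:

```shell
#!/bin/bash

# Hypothetical helper: join all remaining arguments with the separator in $1.
join_with() {
    local sep=$1 out item
    shift
    # Nothing to join: print nothing.
    (( $# )) || return 0
    out=$1
    shift
    for item in "$@"; do
        out+="$sep$item"
    done
    printf '%s\n' "$out"
}
```

In the script, `$(join_with ', ' "${arr[@]}")` could then stand in for the `perl` call.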

Usage (assuming the script is saved as script.sh):

chmod +x script.sh
./script.sh file1.xml file2.xml file3.xml ...
cat new_file
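
The script above reads one id per file, so when a file holds several `<desc>` elements (as in the question's updates) the related ids from all of them end up on one line. A possible alternative, sketched here on the assumption that xmlstarlet is installed (some installs name the binary `xml`) and that the elements carry no namespace, is to let a single xmlstarlet template loop over every `<desc>` and print only its own `<related>` children. `desc_related` is a hypothetical function name:

```shell
#!/bin/bash

# Hypothetical helper: for each XML file given, print one line per <desc>
# element in the form   id related="objectid1, objectid2, ..."
# Assumes xmlstarlet is on PATH and the elements are not namespaced.
desc_related() {
    xmlstarlet sel -T -t \
        -m '//desc' \
            -v '@id' -o ' related="' \
            -m 'related' \
                -v '@objectid' \
                -i 'position() != last()' -o ', ' -b \
            -b \
            -o '"' -n \
        "$@"
}
```

It could be called as `desc_related *.xml >> new_file`. Because the inner `-m 'related'` is evaluated relative to the current `<desc>`, each id is paired only with the objectids nested under it.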