Programmatically swapping two blocks of text

Time: 2017-06-08 14:49:16

Tags: python bash text sed ed

I have an XML file made up of many very similar blocks. Here are two of them:

    <Grid Name="EMFieldMany" GridType="Uniform">
        <Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
        <Attribute AttributeType="Scalar" Name="Er" Center="Node">
            <DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/field/Er-0
            </DataItem>
        </Attribute>
        <Geometry GeometryType="VXVYVZ">
            <DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/r
            </DataItem>
            <DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/theta
            </DataItem>
            <DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/z
            </DataItem>
        </Geometry>
    </Grid>
    <Grid Name="EMFieldMany" GridType="Uniform">
        <Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
        <Attribute AttributeType="Scalar" Name="Er" Center="Node">
            <DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/field/Er-1
            </DataItem>
        </Attribute>
        <Geometry GeometryType="VXVYVZ">
            <DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/r
            </DataItem>
            <DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/theta
            </DataItem>
            <DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/z
            </DataItem>
        </Geometry>
    </Grid>

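Typically I have hundreds of similar <Grid> objects in one file. I would like to programmatically swap the positions of the <DataItem Name="r"> and <DataItem Name="z"> blocks in each <Grid> object, so that the <DataItem>s appear in the order z, theta, r. In addition, every Dimensions attribute that contains three values, Dimensions=" x y z ", should be rewritten as Dimensions=" z y x ".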

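I don't really mind which programming language is used for this. I'm on a Linux workstation with bash, python, perl ... all the standard stuff.

Edit: this answer uses sed to match a block of text, but I'm not sure how to operate on the selected block. This other answer swaps single lines up and down, but I'm not sure how to generalize it to blocks of text and make it swap blocks.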

3 Answers:

Answer 0: (score: 1)

Since you mentioned perl as an option, I suggest this perl solution, although XML is better handled with an XML parser.

#!/usr/bin/perl

# changing input line separator
$/="</Grid>";

while ( $_=<> ) {
    s@(\s*<DataItem Name="r".*?</DataItem>)(\s*<DataItem Name="theta".*?</DataItem>)(\s*<DataItem Name="z".*?</DataItem>)@$3$2$1@s;
    s@<DataItem Dimensions="\K(\d+) (\d+) (\d+) @$3 $2 $1 @;
    print;
}

Or the equivalent one-liner:

perl -pe 'BEGIN{$/="</Grid>"}s@(\s*<DataItem Name="r".*?</DataItem>)(\s*<DataItem Name="theta".*?</DataItem>)(\s*<DataItem Name="z".*?</DataItem>)@$3$2$1@s;s@<DataItem Dimensions="\K(\d+) (\d+) (\d+) @$3 $2 $1 @;' <input.txt
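
For comparison, here is a minimal sketch of the parser-based route that the answer recommends, using Python's standard xml.etree.ElementTree. It assumes the <Grid> elements sit under a single root element (the <Domain> wrapper in SAMPLE is an assumption, since the question only shows a fragment), and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def swap_r_and_z(grid):
    """Swap the r and z DataItem children of every Geometry in one Grid."""
    for geometry in grid.iter("Geometry"):
        items = list(geometry)
        names = [item.get("Name") for item in items]
        if "r" in names and "z" in names:
            i, j = names.index("r"), names.index("z")
            # Swap the trailing whitespace too, so the indentation stays put.
            items[i].tail, items[j].tail = items[j].tail, items[i].tail
            items[i], items[j] = items[j], items[i]
            # Slice assignment replaces the element's children in order.
            geometry[:] = items


def reverse_dimensions(root):
    """Reverse every Dimensions attribute that holds exactly three values."""
    for elem in root.iter():
        dims = elem.get("Dimensions")
        if dims is None:
            continue
        values = dims.split()
        if len(values) == 3:
            # Keep whatever leading/trailing whitespace the attribute had.
            lead = dims[: len(dims) - len(dims.lstrip())]
            trail = dims[len(dims.rstrip()):]
            elem.set("Dimensions", lead + " ".join(reversed(values)) + trail)


def transform(xml_text):
    root = ET.fromstring(xml_text)
    for grid in root.iter("Grid"):
        swap_r_and_z(grid)
    reverse_dimensions(root)
    return ET.tostring(root, encoding="unicode")


# Shortened sample; the <Domain> root and the 10/20/30 sizes are assumptions
# chosen so that the reversal is visible.
SAMPLE = """<Domain>
  <Grid Name="EMFieldMany" GridType="Uniform">
    <Topology TopologyType="3DRectMesh" Dimensions="10 20 30 "/>
    <Geometry GeometryType="VXVYVZ">
      <DataItem Name="r" Dimensions="10">f.hdf5:/coordinates/r</DataItem>
      <DataItem Name="theta" Dimensions="20">f.hdf5:/coordinates/theta</DataItem>
      <DataItem Name="z" Dimensions="30">f.hdf5:/coordinates/z</DataItem>
    </Geometry>
  </Grid>
</Domain>"""

if __name__ == "__main__":
    print(transform(SAMPLE))
```

For a file on disk, ET.parse and the tree's write method would take the place of fromstring and tostring. ElementTree does not keep comments or a DOCTYPE line by default, so it is worth comparing the output with the input before overwriting anything.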

Answer 1: (score: 1)

This can be done easily with ed:

g/<DataItem Name="r"/-ka\
/<DataItem Name="z"/\
-kb\
.,/\/DataItem/m'a\
+1,/\/DataItem/m'b
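
As I read it, the ed script marks where the r and z blocks sit and then moves each block to the other's position. The same line-oriented idea can be sketched in Python. This is only an illustration under stated assumptions: every block ends on the first line containing </DataItem>, the r block comes before the z block, and every r block has a matching z block after it. The helper names find_block and swap_blocks are hypothetical.

```python
def find_block(lines, start, open_marker, close_marker="</DataItem>"):
    """Return (first, last) line indices of the next block at or after start.

    Raises StopIteration when no further block is found.
    """
    first = next(i for i in range(start, len(lines)) if open_marker in lines[i])
    last = next(i for i in range(first, len(lines)) if close_marker in lines[i])
    return first, last


def swap_blocks(lines):
    """Swap each r block with the z block that follows it; return a new list."""
    out = list(lines)
    pos = 0
    while True:
        try:
            r0, r1 = find_block(out, pos, '<DataItem Name="r"')
            z0, z1 = find_block(out, r1 + 1, '<DataItem Name="z"')
        except StopIteration:
            return out
        # Rebuild the span as: z block, whatever lay in between, r block.
        out[r0:z1 + 1] = out[z0:z1 + 1] + out[r1 + 1:z0] + out[r0:r1 + 1]
        pos = z1 + 1


if __name__ == "__main__":
    demo = [
        "<Geometry>",
        '<DataItem Name="r">', "R", "</DataItem>",
        '<DataItem Name="theta">', "T", "</DataItem>",
        '<DataItem Name="z">', "Z", "</DataItem>",
        "</Geometry>",
    ]
    print("\n".join(swap_blocks(demo)))
```

Because it works on whole lines, the original indentation travels with each block. It knows nothing about XML nesting, though, so it should only be trusted on files as regular as the one in the question.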

Answer 2: (score: 0)

If you have the XML as a string, you can do this with replace:

# Swapping (in '\"' the backslash escapes the " so that it becomes a character of the string)
XML_str = XML_str.replace("\"r", "SomethingThatNeverAppearInTheXML")
XML_str = XML_str.replace("\"z", "\"r")
XML_str = XML_str.replace("SomethingThatNeverAppearInTheXML", "\"z")
# Replacing (this matches only the literal text "x y z")
XML_str = XML_str.replace("x y z", "z y x")

Input

XML_str = """    
    <Grid Name="EMFieldMany" GridType="Uniform">
        <Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
        <Attribute AttributeType="Scalar" Name="Er" Center="Node">
            <DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/field/Er-0
            </DataItem>
        </Attribute>
        <Geometry GeometryType="VXVYVZ">
            <DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/r
            </DataItem>
            <DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/theta
            </DataItem>
            <DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/z
            </DataItem>
        </Geometry>
    </Grid>
    <Grid Name="EMFieldMany" GridType="Uniform">
        <Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
        <Attribute AttributeType="Scalar" Name="Er" Center="Node">
            <DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/field/Er-1
            </DataItem>
        </Attribute>
        <Geometry GeometryType="VXVYVZ">
            <DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/r
            </DataItem>
            <DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/theta
            </DataItem>
            <DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
                Field_reflected_time_3D.hdf5:/coordinates/z
            </DataItem>
        </Geometry>
    </Grid>
"""

Output

<Grid Name="EMFieldMany" GridType="Uniform">
    <Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
    <Attribute AttributeType="Scalar" Name="Er" Center="Node">
        <DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
            Field_reflected_time_3D.hdf5:/field/Er-0
        </DataItem>
    </Attribute>
    <Geometry GeometryType="VXVYVZ">
        <DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
            Field_reflected_time_3D.hdf5:/coordinates/r
        </DataItem>
        <DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
            Field_reflected_time_3D.hdf5:/coordinates/theta
        </DataItem>
        <DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
            Field_reflected_time_3D.hdf5:/coordinates/z
        </DataItem>
    </Geometry>
</Grid>
<Grid Name="EMFieldMany" GridType="Uniform">
    <Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
    <Attribute AttributeType="Scalar" Name="Er" Center="Node">
        <DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
            Field_reflected_time_3D.hdf5:/field/Er-1
        </DataItem>
    </Attribute>
    <Geometry GeometryType="VXVYVZ">
        <DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
            Field_reflected_time_3D.hdf5:/coordinates/r
        </DataItem>
        <DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
            Field_reflected_time_3D.hdf5:/coordinates/theta
        </DataItem>
        <DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
            Field_reflected_time_3D.hdf5:/coordinates/z
        </DataItem>
    </Geometry>
</Grid>
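
The replace("x y z", "z y x") call above only matches those three literal letters, so it leaves real values untouched. A regular expression is one way to reverse three-valued Dimensions attributes while staying with the string-based approach. This is a rough sketch, assuming the values are plain unsigned integers separated by whitespace; DIMS and reverse_dims are names invented for the example.

```python
import re

# Dimensions="..." holding exactly three whitespace-separated integers.
# Groups 1 and 5 capture optional padding so that it can be preserved.
DIMS = re.compile(r'Dimensions="(\s*)(\d+)\s+(\d+)\s+(\d+)(\s*)"')


def reverse_dims(xml_text):
    """Rewrite Dimensions="x y z" as Dimensions="z y x" throughout the text."""
    return DIMS.sub(r'Dimensions="\g<1>\g<4> \g<3> \g<2>\g<5>"', xml_text)


if __name__ == "__main__":
    line = '<Topology TopologyType="3DRectMesh" Dimensions="10 20 30 "/>'
    print(reverse_dims(line))
```

Single-valued attributes such as Dimensions="40" do not match the pattern and pass through unchanged.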