Question

I have a large XML file containing a lot of database table definitions, like this:

table name="dbname.tablename" lots of text here>

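I want to replace the closing bracket on every matching line (not every line starts with table name="") so that the original line is kept but slonyId="number" is appended just before the >. To make things more complicated, I want the slonyId number to increment starting from 0, so that if I have 1000 table definitions, the first one looks like: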

table name="dbname.tablename" lots of text here slonyId="0">

and the last one looks like:

table name="dbname.tablename" lots of text here slonyId="999">

What is the best way to solve this?

Thanks in advance!

Answer 1

Adding the solution from JS:

awk -F'>' '/table name/{$NF="slonyid="q x++ q FS}1' q='"' inputFile

Try this:

awk -F'>' '/table name/{print $(NF-1)" slonyid""=""\""NR-1"\""">"}' inputFile

Test added:

$ cat temp.txt
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>
table name="dbname.tablename" lots of text here>


$ awk -F'>' '/table name/{print $(NF-1)" slonyid""=""\""NR-1"\""">"}' temp.txt
table name="dbname.tablename" lots of text here slonyid="0">
table name="dbname.tablename" lots of text here slonyid="1">
table name="dbname.tablename" lots of text here slonyid="2">
table name="dbname.tablename" lots of text here slonyid="3">
table name="dbname.tablename" lots of text here slonyid="4">
table name="dbname.tablename" lots of text here slonyid="5">
table name="dbname.tablename" lots of text here slonyid="6">
table name="dbname.tablename" lots of text here slonyid="7">
table name="dbname.tablename" lots of text here slonyid="8">
table name="dbname.tablename" lots of text here slonyid="9">
table name="dbname.tablename" lots of text here slonyid="10">
table name="dbname.tablename" lots of text here slonyid="11">
table name="dbname.tablename" lots of text here slonyid="12">
table name="dbname.tablename" lots of text here slonyid="13">
table name="dbname.tablename" lots of text here slonyid="14">
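
A note on the second command: it numbers with NR-1 and prints only the matching lines, so its output fits the request only when every line of the file is a table line, as in temp.txt. A minimal sketch that keeps its own counter and passes all other lines through unchanged might look like this. It assumes table lines start with table name= and end with >; the function name add_slony_ids and the variable n are made up for illustration.

```shell
# Hypothetical helper: reads lines on stdin, writes the result to stdout.
add_slony_ids() {
    awk -v n=0 '
        # Only lines that start with "table name=" and end with ">" are touched.
        /^table name=.*>$/ {
            # Replace the final ">" with the attribute and bump the counter.
            sub(/>$/, " slonyid=\"" (n++) "\">")
        }
        # Print every line, modified or not.
        { print }
    '
}
```

It could be used as add_slony_ids < inputFile > outputFile. Because the counter only advances on matching lines, other lines in between do not create gaps in the numbering.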

Answer 2

GNU sed code:

sed = file|sed 'N;s/\n/\t/;/\S\+\s\+table name/!d'|sed =|sed 'N;s/\n/\t/;s/\(\S\+\)\s\+\([^>]\+\)>/\2 slonyid="\1">/;s#\(\S\+\)\s\+\(.*\)#\1 s/.*/\2/#'|sed -f - file

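A pure sed solution using four pipes. Note that it numbers the tables from 1, whereas the question asks for numbering from 0.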

$ cat file
table name="dbname.tablename" lots of text AAA here>
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
index name="dbname.tablename" lots of text ZZZ here>
table name="dbname.tablename" lots of text BBB here>
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
table name="dbname.tablename" lots of text CCC here>
index name="dbname.tablename" lots of text XXX here>
table name="dbname.tablename" lots of text DDD here>
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
index name="dbname.tablename" lots of text ZZZ here>
table name="dbname.tablename" lots of text EEE here>
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
table name="dbname.tablename" lots of text FFF here>
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
index name="dbname.tablename" lots of text ZZZ here>

$ sed = file|sed 'N;s/\n/\t/;/\S\+\s\+table name/!d'|sed =|sed 'N;s/\n/\t/;s/\(\S\+\)\s\+\([^>]\+\)>/\2 slonyid="\1">/;s#\(\S\+\)\s\+\(.*\)#\1 s/.*/\2/#'|sed -f - file
table name="dbname.tablename" lots of text AAA here slonyid="1">
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
index name="dbname.tablename" lots of text ZZZ here>
table name="dbname.tablename" lots of text BBB here slonyid="2">
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
table name="dbname.tablename" lots of text CCC here slonyid="3">
index name="dbname.tablename" lots of text XXX here>
table name="dbname.tablename" lots of text DDD here slonyid="4">
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
index name="dbname.tablename" lots of text ZZZ here>
table name="dbname.tablename" lots of text EEE here slonyid="5">
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
table name="dbname.tablename" lots of text FFF here slonyid="6">
index name="dbname.tablename" lots of text XXX here>
index name="dbname.tablename" lots of text YYY here>
index name="dbname.tablename" lots of text ZZZ here>
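
To see how the pipeline works, it can help to look at the sed script that the first four stages generate before the final sed -f - file applies it. The sketch below wraps those stages in a function so the script can be printed. The name gen_slony_script is made up for illustration, and GNU sed is assumed because the expressions rely on \S, \s, \+ and \t.

```shell
# Hypothetical helper: prints the generated sed script for the file given as $1.
# Stage 1: "sed =" prints each line number on its own line before the line.
# Stage 2: join number and line with a tab, keep only the "table name" lines.
# Stage 3: number the surviving lines; this second number becomes the slonyid.
# Stage 4: turn each pair into one "LINE s/.*/NEW TEXT/" command.
gen_slony_script() {
    sed = "$1" |
        sed 'N;s/\n/\t/;/\S\+\s\+table name/!d' |
        sed = |
        sed 'N;s/\n/\t/;s/\(\S\+\)\s\+\([^>]\+\)>/\2 slonyid="\1">/;s#\(\S\+\)\s\+\(.*\)#\1 s/.*/\2/#'
}
```

For the sample file above, the first generated command is 1 s/.*/table name="dbname.tablename" lots of text AAA here slonyid="1">/, that is, "on line 1, replace the whole line with the new text". Since the line text is pasted into the replacement part of an s command, lines containing / or & would need escaping before this approach is safe.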

Answer 3

If I understand your question correctly, this perl one-liner will do the job:

perl -pi.bak -e 'BEGIN {$count=0}; if (/^table name=/) { s/^(table name=.*)>$/$1 slonyId="$count">/; $count++}' inputFile.xml

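These options tell perl to loop over the lines of the given files and print each one back (-p), editing the files in place while keeping a backup named "orig_filename.bak" (-i.bak):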

perl -pi.bak -e

This initializes the $count variable:

BEGIN {$count=0};

This performs the substitution you asked for and increments the count:

if (/^table name=/) { s/^(table name=.*)>$/$1 slonyId="$count">/; $count++}

The list of file names is then given at the end:

inputFile.xml

This is not a very robust solution and may break if any line in your file does not match the description given above, but it should work for your problem.
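
Because a line that starts with table name= but does not end in > would be left without an attribute (and would still use up a number), a quick sanity check after the run can be useful. The sketch below is not part of the original answer. It assumes the same line layout, and the function name check_slony_ids is made up for illustration. It compares the number of table lines with the number of table lines that end in a slonyId attribute.

```shell
# Hypothetical helper: succeeds only if every table line in $1 carries a slonyId.
check_slony_ids() {
    # Count lines that start with "table name=".
    tables=$(grep -c '^table name=' "$1" || true)
    # Count those that also end with slonyId="<digits>">.
    tagged=$(grep -c '^table name=.* slonyId="[0-9][0-9]*">$' "$1" || true)
    echo "table lines: $tables, with slonyId: $tagged"
    [ "$tables" -eq "$tagged" ]
}
```

Running check_slony_ids inputFile.xml after the one-liner prints both counts and returns a non-zero status if they differ.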

I think I am too new to comment directly on the other solutions, but in my tests FDinoff's solution adds slonyId to lines like this one:

not a table name="dbname.tablename" lots of text here>

Amit's solution adds slonyId to every line, not just the lines that start with "table name".

Answer 4

A vim solution

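Use :global to find the lines that start with table name=, and replace the > on each of those lines with slonyId="number">. You can do this with the following two lines.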

:let i = 0
:g/^table name=/s/>/\=' slonyId="' . i . '"' . submatch(0)/ | let i=i+1

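The first line initializes i to 0. On each matching line, the substitution evaluates the expression after \=, which uses string concatenation to build the replacement from the current value of i and the matched > (submatch(0)). i is then incremented after the substitution, so the next substitution gets the next number in the sequence.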

Answer 5

You should never edit XML files with line-by-line string manipulation. XML is not structured that way. Always use a proper XML parser, such as Perl's XML::LibXML:

#!/usr/bin/env perl

use strict;
use warnings;
use XML::LibXML;

my $xml = XML::LibXML->new->parse_file('/path/to/input.xml');

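# Number every <table> element in document order, starting at 0.
my $i = 0;
$_->setAttribute('slonyId', $i++) for $xml->findnodes('//table');

# Write the modified document to a new file.
$xml->toFile('/path/to/output.xml');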

Modifying XML tags that match a specific pattern with regex tools