有两个文件file1和file2。他们的内容是:
file1 - 输入
Line1
Line2
Line3
Line4
file2 - 输入
<head>
<intro> This is an introduction </intro>
<line> this is a line1 </line>
</head>
<head>
<intro> This is another intro </intro>
<line> this is a line2 </intro>
</head>
<head>
<intro> This is an introduction </intro>
<line> this is a line3 </line>
</head>
<head>
<intro> This is another intro </intro>
<line> this is a line4 </intro>
</head>
想要读取file1并用Line1,Line2,Line3,Line4替换file2中的行标记值(请参阅输出)。这是最简单的方法(sed,awk,grep,perl,python ......)吗?
输出
<head>
<intro> This is an introduction </intro>
<line> Line1 </line>
</head>
<head>
<intro> This is another intro </intro>
<line> Line2 </intro>
</head>
<head>
<intro> This is an introduction </intro>
<line> Line3 </line>
</head>
<head>
<intro> This is another intro </intro>
<line> Line4 </intro>
</head>
如果您认为这是重复的,请将副本链接起来。我试图通过看似相似的解决方案,但没有找到我。
修改 如果有人想要追加/连接而不是替换,可以在@cdarke的python2代码中轻松修改 markline 表达式,如下所示并使用。
markline = re.sub(r'</line>$',''+subt+'</line>',markline)
答案 0 :(得分:3)
使用GNU sed和bash的进程替换:
sed -e '/<line>[^<]*<\/[^>]*>/{R '<(sed 's|.*| <line> & </line>|' file1) -e 'd;}' file2
输出:
<head>
<intro> This is an introduction </intro>
<line> Line1 </line>
</head>
<head>
<intro> This is another intro </intro>
<line> Line2 </line>
</head>
<head>
<intro> This is an introduction </intro>
<line> Line3 </line>
</head>
<head>
<intro> This is another intro </intro>
<line> Line4 </line>
</head>
答案 1 :(得分:2)
最简单的方法可能就是您熟悉的方法。如果您了解这些语言,它在Perl和Python(以及Ruby和Lua)中很容易。 '轻松'是主观的。
(编辑示例以添加空格)
这是Python 2版本:
import re
lines = open('file1').readlines()
with open('file2') as fh:
for markline in fh:
if '<line>' in markline:
subt = lines.pop(0).rstrip()
markline = re.sub(r'<line>.*</line>', '<line> ' + subt + ' </line>',
markline)
print markline,
这是一个Perl版本:
use strict;
use warnings;
open(my $fh1, 'file1') or die "Unable to open file1 for read: $!";
my @lines = <$fh1>;
chomp(@lines);
close($fh1);
open(my $fh2, 'file2') or die "Unable to open file2 for read: $!";
while (<$fh2>) {
s/<line>.*<\/line>/'<line> ' . shift(@lines) . ' <\/line>'/e;
print
}
close($fh2);
我在输入数据中假设了拼写错误。
我展示的代码有效,但不灵活。所有这些语言都有几个XML解析器,实际上您应该学习其中一种语言和XML解析器。