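I have two files, one with text and another with key/hash values. I want to replace every occurrence of a key with its hash value. The following code does this; what I would like to know is whether there is a better way than the foreach loop I am using.
Thanks, all.
Edit: I know it is a bit odd to use
s/\n//;
s/\r//;
instead of chomp, but this works for files with mixed line endings (edited on both Windows and Linux), and chomp (I think) does not.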
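A minimal sketch of the line-ending point, assuming the goal is simply to drop whatever terminator a line carries. With the default $/ ("\n"), chomp removes only the trailing "\n", so a "\r" from a Windows-edited file stays behind. The helper name strip_eol is hypothetical and used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: remove any trailing CR/LF characters from a line.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;
    return $line;
}

my $unix    = strip_eol("here\th3r3\n");     # LF line ending
my $windows = strip_eol("here\th3r3\r\n");   # CRLF line ending

# For comparison: chomp with the default $/ leaves the carriage return.
my $chomped = "here\th3r3\r\n";
chomp $chomped;

print "both endings give the same line\n" if $unix eq $windows;
print "chomp left a trailing CR\n"        if $chomped =~ /\r\z/;
```

A single substitution anchored at the end of the string does the work of the two separate substitutions and never touches characters in the middle of the line.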
File with the key/hash values (hash.tsv):
strict $tr|ct
warnings w@rn|ng5
here h3r3
File with the text (doc.txt):
Do you like use warnings and strict?
I do not like use warnings and strict.
Do you like them here or there?
I do not like them here or there?
I do not like them anywhere.
I do not like use warnings and strict.
I will not obey your good coding practice edict.
Perl script:
#!/usr/bin/perl
use strict;
use warnings;
open (fh_hash, "<", "hash.tsv") or die "could not open file $!";
my %hash = ();
while (<fh_hash>)
{
    # strip LF and CR separately so mixed line endings are handled
    s/\n//;
    s/\r//;
    my @tmp_hash = split(/\t/);
    # scalar element access ($tmp_hash[0]), not a one-element slice
    $hash{ $tmp_hash[0] } = $tmp_hash[1];
}
close (fh_hash);
open (fh_in, "<", "doc.txt") or die "could not open file $!";
open (fh_out, ">", "doc.out") or die "could not open file $!";
while (<fh_in>)
{
    # one substitution pass over the line per key
    foreach my $key ( keys %hash )
    {
        s/$key/$hash{$key}/g;
    }
    print fh_out;
}
close (fh_in);
close (fh_out);
Answer 0 (score: 2)
You can read the whole file into a variable and, for each key/value pair, replace all occurrences in one pass.
Something like:
use strict;
use warnings;
use YAML;
use File::Slurp;
my $href = YAML::LoadFile("hash.yaml");
my $text = read_file("doc.txt");
foreach (keys %$href) {
    # \Q...\E keeps regex metacharacters in a key from being interpreted
    $text =~ s/\Q$_\E/$href->{$_}/g;
}
open (my $fh_out, ">", "doc.out") or die "could not open file $!";
print $fh_out $text;
close $fh_out;
This produces:
Do you like use w@rn|ng5 and $tr|ct?
I do not like use w@rn|ng5 and $tr|ct.
Do you like them h3r3 or th3r3?
I do not like them h3r3 or th3r3?
I do not like them anywh3r3.
I do not like use w@rn|ng5 and $tr|ct.
I will not obey your good coding practice edict.
To shorten the code, I used YAML and replaced the input file with (hash.yaml):
strict: $tr|ct
warnings: w@rn|ng5
here: h3r3
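and used File::Slurp to read the whole file into a variable. Of course, you can "slurp" the file without File::Slurp, for example:
my $file = "doc.txt";   # assumed input file name
my $text;
{
    local($/); #or undef $/;
    open(my $fh, "<", $file ) or die "problem $!\n";
    $text = <$fh>;
    close $fh;
}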
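Answer 1 (score: 2)
One problem with
for my $key (keys %hash) {
    s/$key/$hash{$key}/g;
}
is that it does not correctly handle
foo => bar
bar => foo
Instead of swapping them, you end up with all "foo" or all "bar", and you cannot even control which, because each pass rewrites the output of the previous one and the order of keys is unspecified.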
# Do once, not once per line
my $pat = join '|', map quotemeta, keys %hash;
s/($pat)/$hash{$1}/g;
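As a self-contained illustration of the single-pass approach above, here is a small sketch. The two-entry hash and the sample string are made up for the demonstration.

```perl
use strict;
use warnings;

# Illustrative data only: two keys that map to each other.
my %hash = ( foo => 'bar', bar => 'foo' );

# Build the alternation once, escaping regex metacharacters in the keys.
my $pat = join '|', map quotemeta, keys %hash;

my $line = "foo bar foo";

# Each match is looked up and replaced exactly once, so replacements
# are never themselves replaced again.
(my $swapped = $line) =~ s/($pat)/$hash{$1}/g;

print "$swapped\n";   # prints "bar foo bar"
```

Because the text is scanned only once, the result does not depend on the order in which keys %hash returns the keys.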
You may also want to handle
foo => bar
food => baz
by taking the longest match first, instead of ending up with "bard".
# Do once, not once per line
my $pat =
    join '|',
    map quotemeta,
    sort { length($b) <=> length($a) }
    keys %hash;
s/($pat)/$hash{$1}/g;
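A runnable sketch of the longest-first variant, under the assumption that the keys are plain strings. The helper name build_pattern and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Illustrative data only: one key is a prefix of the other.
my %hash = ( foo => 'bar', food => 'baz' );

# Hypothetical helper: alternation of escaped keys, longest key first,
# so that "food" is tried before its prefix "foo".
sub build_pattern {
    my ($map) = @_;
    return join '|',
        map quotemeta,
        sort { length($b) <=> length($a) }
        keys %$map;
}

my $pat  = build_pattern(\%hash);
my $text = "foo food";

(my $out = $text) =~ s/($pat)/$hash{$1}/g;

print "$out\n";   # prints "bar baz"
```

Sorting matters because Perl's regex alternation takes the first alternative that matches at a position, not the longest one.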