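Using Perl and reading a file line by line, I need to delete all the text between two specific words (say "dog" and "cat"), but I don't know how to do it when the two words are several lines apart. I'm trying to use the "s" modifier, which lets the dot (.) match a newline, but it doesn't work: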
use warnings;
use strict;
my $filename = shift;
open F, $filename or die "Usa: $0 FILENAME\n";
while(<F>) {
s/dog.*?cat//s;
print;
}
close F;
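Answer 0 (score: 1)
You are reading the file line by line and then substituting, so the regex never sees more than one line at a time. To work on the whole text at once, set the input record separator to undef using
local $/;
Then, when you read <F>, you get the entire file contents and the substitution should work.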
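As a minimal sketch of this approach, slurp mode with a lexical filehandle might look like the following. The helper name strip_between and the in-memory filehandle are illustrative assumptions, not part of the question's code:

```perl
use strict;
use warnings;

# Hypothetical helper: read everything from a filehandle at once and
# remove the first "dog ... cat" span, even if it crosses line breaks.
sub strip_between {
    my ($fh) = @_;
    my $text = do {
        local $/;      # undef input record separator => slurp mode
        <$fh>;
    };
    $text = '' unless defined $text;   # undef if the handle was already at EOF
    $text =~ s/dog.*?cat//s;           # /s lets . match newlines too
    return $text;
}

# Demo on an in-memory filehandle so the example needs no real file.
my $input = "one\ntwo dog three\nfour\nfive cat six\nseven\n";
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";
print strip_between($fh);
close $fh;
```

Wrapping local $/ in a do block keeps the change to the record separator confined to the single read, so later line-by-line reads elsewhere in the program are unaffected.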
Answer 1 (score: 1)
while (<F>) {
my $n = s/dog.*//s .. s/.*?cat//;
$n ||= 0;
print if $n <= 1 or $n =~ /E/;
}
Answer 2 (score: 0)
The answers above are correct. I just dealt with this problem myself. You could try:
use strict;
my $filename = shift;
open F, $filename or die "Usa: $0 FILENAME\n";
my $buffer;
{
local $/;
$buffer = <F>;
$buffer =~ s/dog.*?cat//s;
}
print $buffer;
Note that this can have side effects you may not want. Consider the input:
dog foo dog bar cat
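Do you want 'foo' to be part of what is removed? The regex matches starting at the leftmost 'dog', so 'foo' is removed even though .*? is non-greedy... which may or may not be what you want.
The CPAN module Regexp::Common::balanced can help you work out the right way to handle edge cases like this.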
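To make the edge case concrete, here is a small sketch that runs the substitution on that input. The second pattern is only one possible alternative, and it assumes you want the innermost dog ... cat span removed:

```perl
use strict;
use warnings;

my $input = "dog foo dog bar cat";

# The pattern from the answer: the match starts at the leftmost "dog",
# so the non-greedy .*? cannot skip ahead to the second "dog".
(my $outer = $input) =~ s/dog.*?cat//s;

# Alternative (an assumption about the desired behaviour): forbid another
# "dog" inside the match, so only the innermost dog ... cat span is removed.
(my $inner = $input) =~ s/dog(?:(?!dog).)*?cat//s;

print "outer: [$outer]\n";   # prints "outer: []"
print "inner: [$inner]\n";   # prints "inner: [dog foo ]"
```

The negative lookahead (?!dog) is checked before each character is consumed, which is what stops the match from running across a second "dog".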
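Answer 3 (score: 0)
Slurping the file by localizing $/ would be your simplest solution. However, if you want to process the file line by line, you just need to keep track of a $state variable: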
use strict;
use warnings;
use autodie;
my $filename = shift;
#open my $fh, '<', $filename;
my $state = 0;
while(<DATA>) {
if ($state == 0 && s/(.*?)dog//) {
print $1;
$state = 1;
}
if ($state == 1 && s/.*?cat//) {
$state = 2;
# If you want to handle more than one dog/cat pair, use below code
# $state = 0;
# redo;
}
if ($state != 1) {
print;
}
}
#close $fh;
__DATA__
1 hello world
2 more lines
3 this cat is ignored
4 and yet more
5 this has <dog ... yep, it really does
6 stuff to delete
7 this has cat>, cuz cats rock
8 Filler line
9 more <dogs are ignored.
10 more cat>s
11 more filler
12 yet more filler
13 More <dogs and cat>s and stuff
14 more filler
15 more filler
16 more <dogs and cat>s and <dogs and cat>s, see.
17 ending stuff
Output:
1 hello world
2 more lines
3 this cat is ignored
4 and yet more
5 this has <>, cuz cats rock
8 Filler line
9 more <dogs are ignored.
10 more cat>s
11 more filler
12 yet more filler
13 More <dogs and cat>s and stuff
14 more filler
15 more filler
16 more <dogs and cat>s and <dogs and cat>s, see.
17 ending stuff
If you uncomment those two lines so that multiple dog/cat pairs are filtered out, you get the following:
1 hello world
2 more lines
3 this cat is ignored
4 and yet more
5 this has <>, cuz cats rock
8 Filler line
9 more <>s
11 more filler
12 yet more filler
13 More <>s and stuff
14 more filler
15 more filler
16 more <>s and <>s, see.
17 ending stuff
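For comparison, and assuming the whole input fits in memory, the multi-pair behaviour can be approximated on a slurped string with the /g modifier. The helper name strip_all_pairs is hypothetical, and the sample is a few lines borrowed from the data above:

```perl
use strict;
use warnings;

# Hypothetical helper: remove every dog ... cat span from a string.
sub strip_all_pairs {
    my ($text) = @_;
    $text =~ s/dog.*?cat//gs;   # /g repeats the match, /s lets . cross newlines
    return $text;
}

my $sample = join '',
    "5 this has <dog ... yep\n",
    "6 stuff to delete\n",
    "7 this has cat>, cuz cats rock\n",
    "13 More <dogs and cat>s and stuff\n";

print strip_all_pairs($sample);
```

One difference to keep in mind: if a "dog" is never followed by a "cat", this regex leaves the text untouched, whereas the state-variable loop stops printing once it has seen the unmatched "dog".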