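I need to build a file that reports how many times each word from another file (e.g. Word1 and Word2) occurs, together with the lines on which those words appear, in this format:
Word1: 35 [25, 50, 300, ...]
Word2: 15 [10, 25, 65, ...]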
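Answer 0 (score: 0)
Unfortunately, your question lacks sample input files that demonstrate everything you need to handle, as well as the expected output for those files, so I'm making up some details here.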
Given the files wordlist.txt:
cat
dog
fish
horse
and input.txt:
There are three fish.
Two red fish.
One blue fish and a brown dog.
There are no matching words on this line.
Also there is no cat, only the dog. Oh, there is a white dog too.
There are doggies.
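This Perl script prints each word from the word list with its match count and the line numbers it was found on. A word that occurs several times on one line is counted once per occurrence: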
#!/usr/bin/env perl
use warnings;
use strict;
use autodie;
use feature qw/say/;
use English;

# Map each word in the word list to an (initially empty) list of line numbers
my %words;
open my $wordlist, "<", $ARGV[0];
while (<$wordlist>) {
    chomp;
    $words{$_} = [];
}

open my $text, "<", $ARGV[1];
while (<$text>) {
    while (my ($word, $positions) = each %words) {
        while (m/\b\Q$word\E\b/g) { # Match all occurrences of the word by itself
            push @$positions, $NR; # $NR is the current line number of the text file
        }
    }
}

$OFS = ' '; # Separate the fields printed by say with a space
for my $word (sort keys %words) {
    my $positions = $words{$word};
    say "$word:", scalar(@$positions), join(',', @$positions);
}
Example:
$ perl words.pl wordlist.txt input.txt
cat: 1 5
dog: 3 3,5,5
fish: 3 1,2,3
horse: 0
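The question asked for the line numbers in a bracketed list, while the script above prints them comma-separated. The following is a minimal sketch of one way to produce that format, not a definitive implementation. It assumes in-memory copies of the sample data instead of files, and `word_positions` and `format_entry` are hypothetical helper names introduced here for illustration; they are not part of the answer.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: for each word, collect the 1-based numbers of the
# lines on which it occurs as a whole word.
sub word_positions {
    my ($words, $lines) = @_;
    my %positions = map { $_ => [] } @$words;
    my $line_number = 0;
    for my $line (@$lines) {
        $line_number++;
        for my $word (@$words) {
            # \b...\b matches the word by itself; \Q...\E quotes any regex
            # metacharacters. One entry is recorded per occurrence, so a
            # line number can appear more than once.
            push @{ $positions{$word} }, $line_number
                while $line =~ /\b\Q$word\E\b/g;
        }
    }
    return \%positions;
}

# Hypothetical helper: render one report line as "word: count [n1, n2, ...]".
sub format_entry {
    my ($word, $line_numbers) = @_;
    return sprintf '%s: %d [%s]', $word, scalar @$line_numbers,
        join(', ', @$line_numbers);
}

# Assumption: same contents as wordlist.txt and input.txt above.
my @words = qw(cat dog fish horse);
my @lines = (
    'There are three fish.',
    'Two red fish.',
    'One blue fish and a brown dog.',
    'There are no matching words on this line.',
    'Also there is no cat, only the dog. Oh, there is a white dog too.',
    'There are doggies.',
);

my $positions = word_positions(\@words, \@lines);
print format_entry($_, $positions->{$_}), "\n" for sort keys %$positions;
```

With the sample data this prints `cat: 1 [5]`, `dog: 3 [3, 5, 5]`, `fish: 3 [1, 2, 3]` and `horse: 0 []`. As in the script above, "doggies" on the last line does not count as a match for "dog" because of the `\b` word boundaries.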