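I have a set of words, and I want to find the repeats in a sentence based on occurrences of two or more of those words.
Example:
I want to spot 'boy' or 'boys' and 'girl' or 'girls' in a sentence, so that I end up with these sets: (boy and girl), (boy and girls), (girl and boys) and (boys and girls).
Sentence:
The boy is going to school with a girl, because the boys like the girls so much.
Sentence representation:
The WORD1 is going to school with a WORD2, because the WORD3 like the WORD4 so much.
How can I produce four (4) different forms of the sentence, so that it looks like this:
Output: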
The WORD1 is going to school with a WORD2, because the WORD like the WORD so much.
The WORD1 is going to school with a WORD, because the WORD like the WORD4 so much.
The WORD is going to school with a WORD2, because the WORD3 like the WORD so much.
The WORD is going to school with a WORD, because the WORD3 like the WORD4 so much.
NB:
The number of words is dynamic and can be two or more; in this example there are four.
Answer 0 (score: 1)
Use a backreference:
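# \1 has no trailing \b, so "boy" also matches the start of "boys".
# Only the first repeated word is reported.
if ($sentence =~ m/\b(\w+)\b.*\b\1/) {
    print "repeated use of the word $1\n";
}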
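The backreference above stops at the first repeat it finds. As a rough sketch of how every repeated word could be listed, the following groups words by a naive stem. The helper name `repeated_words` and the rule "a trailing s on a word longer than three letters marks a plural" are assumptions made for illustration, not part of the answer.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): group the words of a sentence
# by a naive stem and return only the stems seen more than once.
sub repeated_words {
    my ($sentence) = @_;
    my %seen;
    for my $word ($sentence =~ /\b(\w+)\b/g) {
        my $stem = lc $word;
        # Assumption: a trailing "s" on a longer word marks a plural.
        $stem =~ s/s\z// if length($stem) > 3;
        push @{ $seen{$stem} }, $word;
    }
    my %repeated;
    for my $stem (keys %seen) {
        $repeated{$stem} = $seen{$stem} if @{ $seen{$stem} } > 1;
    }
    return \%repeated;
}

my $repeats = repeated_words(
    'The boy is going to school with a girl, because the boys like the girls so much.'
);
for my $stem (sort keys %$repeats) {
    print "$stem: @{ $repeats->{$stem} }\n";
}
```

For the example sentence this reports the stems boy, girl and the. A stop-word list would be needed to drop words such as "the".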
Answer 1 (score: 1)
It still needs a lot of improvement, but the following should get you started and point you in the right direction:
#!/usr/bin/env perl
use strict;
use warnings;
use Algorithm::Permute;
use Lingua::EN::Tagger;
use Lingua::EN::Inflect::Number qw(to_S);

my $text = q{The boy is going to school with a girl, because the boys
like the girls so much.};

# Tag parts of speech, then pull out the nouns (noun => occurrence count).
my $tagger = Lingua::EN::Tagger->new;
my $tagged_text = $tagger->add_tags( $text );
my %nouns = $tagger->get_nouns( $tagged_text );

# Group the surface forms by their singular, e.g. boy => { boy, boys }.
my %normalized;
for my $noun (keys %nouns) {
    $normalized{ to_S($noun) }{ $noun } = undef;
}

# Print every ordering of the forms found for each singular.
for my $nouns (values %normalized) {
    my $p = Algorithm::Permute->new([ keys %$nouns ]);
    while (my @tuple = $p->next) {
        print join(', ', @tuple), "\n";
    }
}
Output:
boy, boys
boys, boy
school
girl, girls
girls, girl
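Neither answer goes as far as producing the templated sentences the question asks for. A minimal core-Perl sketch of that last step follows. It assumes the word groups are supplied by hand rather than derived with the tagger, and the helper name `sentence_forms` is hypothetical. Every occurrence of a listed word is numbered in order of appearance, and each output form keeps the number on one occurrence per group while reducing the others to a bare WORD.

```perl
use strict;
use warnings;

# Hypothetical helper: one templated sentence per combination of
# one occurrence from each word group (each group is an array ref).
sub sentence_forms {
    my ($sentence, @groups) = @_;

    # Map every word form to the index of its group.
    my %group_of;
    for my $g (0 .. $#groups) {
        $group_of{ lc $_ } = $g for @{ $groups[$g] };
    }
    my $alt = join '|', map { quotemeta($_) }
              sort { length($b) <=> length($a) } keys %group_of;

    # Record each occurrence: its 1-based number, offset, length and group.
    my @hits;
    my @by_group = map { [] } @groups;
    while ($sentence =~ /\b($alt)\b/gi) {
        my $hit = {
            n     => scalar(@hits) + 1,
            pos   => $-[1],
            len   => length($1),
            group => $group_of{ lc $1 },
        };
        push @hits, $hit;
        push @{ $by_group[ $hit->{group} ] }, $hit->{n};
    }

    # Cartesian product: pick one occurrence number from each group.
    my @combos = ([]);
    for my $members (@by_group) {
        @combos = map { my $c = $_; map { [ @$c, $_ ] } @$members } @combos;
    }

    my @forms;
    for my $combo (@combos) {
        my %keep = map { $_ => 1 } @$combo;
        my $form = $sentence;
        # Replace from right to left so earlier offsets stay valid.
        for my $hit (reverse @hits) {
            my $label = $keep{ $hit->{n} } ? "WORD$hit->{n}" : 'WORD';
            substr($form, $hit->{pos}, $hit->{len}, $label);
        }
        push @forms, $form;
    }
    return @forms;
}

my $sentence = 'The boy is going to school with a girl, because the boys like the girls so much.';
print "$_\n" for sentence_forms($sentence, [qw(boy boys)], [qw(girl girls)]);
```

With the two groups shown, this prints the four forms listed in the question, in the same order. Passing more groups, or groups with more forms, produces one sentence per combination, which covers the "two or more words" case.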