Question

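I have some words, and I am interested in producing duplicates of a sentence based on the occurrence of two or more of those words.

Example:

I want to spot 'boy' or 'boys' and 'girl' or 'girls' in a sentence, so that I get these sets: (boy and girl), (boys and girls), (girl and boys), and (boys and girl).

Sentence:

The boy is going to school with a girl, because the boys like the girls so much.

Sentence representation:

The WORD1 is going to school with a WORD2, because the WORD3 like the WORD4 so much.

How can I get four (4) different forms of the sentence, so that it looks like this:

Output: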

The WORD1 is going to school with a WORD2, because the WORD like the WORD so much.
The WORD1 is going to school with a WORD, because the WORD like the WORD4 so much.
The WORD is going to school with a WORD2, because the WORD3 like the WORD so much.
The WORD is going to school with a WORD, because the WORD3 like the WORD4 so much.

NB.

The number of words is dynamic and can be 2 or more; in this example I have 4 words.

Answer 1

Use a backreference:

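# \1 must match the captured word as a whole word, hence the trailing \b
# (without it, "the" followed by "then" would count as a repetition).
if ($sentence =~ m/\b(\w+)\b.*\b\1\b/) {
  print "repeated use of the word $1\n";
}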
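
To list every repeated word rather than only the first one found, the same backreference can go inside a lookahead. The following is a minimal sketch using core Perl only; `repeated_words` is a hypothetical helper name, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: return each word that appears again later in
# $sentence (case-insensitive), in order of first appearance.
sub repeated_words {
    my ($sentence) = @_;
    my (%seen, @repeated);

    # The lookahead checks that the captured word occurs again as a whole
    # word further on, without consuming the text in between.
    while ($sentence =~ /\b(\w+)\b(?=.*?\b\1\b)/gis) {
        my $word = lc $1;
        push @repeated, $word unless $seen{$word}++;
    }
    return @repeated;
}

print join(', ', repeated_words(
    'The boy is going to school with a girl, because the boy likes the girl so much.'
)), "\n";   # prints "the, boy, girl"
```

Note that a backreference only matches the identical word, so on the sentence from the question this reports just "the": 'boy' and 'boys' are not treated as the same word, and neither are 'girl' and 'girls'.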

Answer 2

While it still needs a lot of improvement, the following should get you started and point you in the right direction:

#!/usr/bin/env perl

use strict;
use warnings;

use Algorithm::Permute;
use Lingua::EN::Tagger;
use Lingua::EN::Inflect::Number qw(to_S);

my $text = q{The boy is going to school with a girl, because the boys
like the girls so much.};

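# Tag each word in the text with its part of speech.
my $tagger = Lingua::EN::Tagger->new;

my $tagged_text = $tagger->add_tags( $text );

# get_nouns returns a hash mapping each noun to its number of occurrences.
my %nouns = $tagger->get_nouns( $tagged_text );

# Group the noun forms by their singular, e.g. boy => { boy, boys }.
my %normalized;
for my $noun (keys %nouns) {
    $normalized{ to_S($noun) }{ $noun } = undef;
}

# Print every ordering of the forms within each group.
for my $forms (values %normalized) {
    my $p = Algorithm::Permute->new([ keys %$forms ]);

    while (my @tuple = $p->next) {
        print join(', ', @tuple), "\n";
    }
}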

Output (the order of the lines may vary, since Perl hash order is not fixed):

boy, boys
boys, boy
school
girl, girls
girls, girl
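
Once the word forms are grouped, the four sentence forms from the question can be built by choosing one occurrence from each group. The following is a rough sketch under stated assumptions, not a definitive implementation: `sentence_forms` is a hypothetical helper, the groups are passed in by hand (they could perhaps be derived from `%normalized` above, after dropping groups such as 'school'), and only core Perl is used.

```perl
use strict;
use warnings;

# Hypothetical helper: build every form of $sentence in which exactly one
# occurrence from each word group keeps its numbered WORDn label and every
# other occurrence becomes a plain WORD.
sub sentence_forms {
    my ($sentence, @groups) = @_;

    # Map each word form to the index of its group.
    my %group_of;
    for my $g (0 .. $#groups) {
        $group_of{ lc $_ } = $g for @{ $groups[$g] };
    }
    return () unless %group_of;

    # Split on the target words; the capture group keeps them in the list,
    # so even indices hold plain text and odd indices hold matched words.
    my $alternation = join '|',
        map { quotemeta } sort { length $b <=> length $a } keys %group_of;
    my @tokens = split /\b($alternation)\b/i, $sentence;

    # Number the occurrences from 1 and record them per group.
    my @occ_by_group = map { [] } @groups;
    my %number_of;    # token index -> occurrence number
    my $n = 0;
    for (my $i = 1; $i < @tokens; $i += 2) {
        $number_of{$i} = ++$n;
        push @{ $occ_by_group[ $group_of{ lc $tokens[$i] } ] }, $i;
    }

    # Cartesian product: one chosen occurrence per group.
    my @choices = ([]);
    for my $occ (@occ_by_group) {
        @choices = map { my $c = $_; map { [ @$c, $_ ] } @$occ } @choices;
    }

    my @forms;
    for my $choice (@choices) {
        my %keep = map { $_ => 1 } @$choice;
        my $form = '';
        for my $i (0 .. $#tokens) {
            if    ($i % 2 == 0) { $form .= $tokens[$i] }           # plain text
            elsif ($keep{$i})   { $form .= "WORD$number_of{$i}" }  # chosen occurrence
            else                { $form .= 'WORD' }                # other occurrence
        }
        push @forms, $form;
    }
    return @forms;
}

my $sentence = 'The boy is going to school with a girl, '
             . 'because the boys like the girls so much.';
print "$_\n" for sentence_forms($sentence, [qw(boy boys)], [qw(girl girls)]);
```

With two groups of two occurrences each this yields 2 x 2 = 4 forms; more groups or more occurrences simply enlarge the product, which covers the "2 or more words" case from the question.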

How to detect and duplicate word co-occurrences in a sentence in Perl?
