I have an array like this (this is only a small excerpt; the real one has more than 2,000 lines like these):
@list = (
"affaire,chose,question",
"cause,chose,matière",
);
I would like to get a result like this:
%te = (
affaire => "chose", "question",
chose => "affaire", "question", "cause", "matière",
question => "affaire", "chose",
cause => "chose", "matière",
matière => "cause", "chose"
);
I wrote the script below, but it does not work properly, and I think it is too complicated.
use Data::Dumper;
@list = (
"affaire,chose,question",
"cause,chose,matière",
);
%te;
for ($a = 0; $a < @list; $a++){
@split_list = split (/,/,$list[$a]);
}
foreach $elt (@split_list){
print "SPLIT ELT : $split_list[$elt]\n";
for ($i = 0; $i < @list; $i++){
$test = $list[$i]; #$test = "affaire,chose,question"
if (exists $te{$split_list[$elt]}){ #if exists affaire in %te
@t = split (/,/,$test); # @t = affaire chose question
print "T : @t\n";
@temp = grep(!/$split_list[$elt]/, @t);
print "GREP : @temp\n";#@temp = chose question
@fin = join(', ', @temp); #@fin = chose, question;
for ($k = 0; $k < @fin; $k++){
$te{$split_list[$elt]} .= $fin[$k]; #affaire => chose, question
}
}
else {
@t = split (/,/,$test); # @t = affaire chose question
print "T : @t\n";
@temp = grep(!/$split_list[$elt]/, @t);
print "GREP : @temp\n";#@temp = chose question
@fin = join(', ', @temp); #@fin = chose, question;
for ($k = 0; $k < @fin; $k++){
$te{$split_list[$elt]} = $fin[$k];
}
}
}
}
print Dumper \%te;
Output:
SPLIT ELT : cause
T : affaire chose question
GREP : affaire chose question
T : cause chose matière
GREP : chose matière
SPLIT ELT : cause
T : affaire chose question
GREP : affaire chose question
T : cause chose matière
GREP : chose matière
SPLIT ELT : cause
T : affaire chose question
GREP : affaire chose question
T : cause chose matière
GREP : chose matière
$VAR1 = {
'cause' => 'affaire, chose, questionchose, matièreaffaire, chose, questionchose, matièreaffaire, chose, questionchose, matière'
};
Answer 0 (score: 3)
For each element of @list, split it on "," and use each field as a key of %te, pushing the other fields onto that key's value:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @list = (
"affaire,chose,question",
"cause,chose,matière",
);
my %te;
foreach my $str (@list) {
my @field = split /,/, $str;
foreach my $key (@field) {
my @other = grep { $_ ne $key } @field;
push @{$te{$key}}, @other;
}
}
print Dumper(\%te);
Output:
$ perl t.pl
$VAR1 = {
'question' => [
'affaire',
'chose'
],
'affaire' => [
'chose',
'question'
],
'matière' => [
'cause',
'chose'
],
'cause' => [
'chose',
'matière'
],
'chose' => [
'affaire',
'question',
'cause',
'matière'
]
};
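To get something closer to the flat listing shown in the question, the array references can be joined when printing. This is only a minimal sketch on the same sample data; the sort is there purely to make the line order stable and is not required by the approach:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @list = (
    "affaire,chose,question",
    "cause,chose,matière",
);

# Build the hash of arrays exactly as in the code above.
my %te;
foreach my $str (@list) {
    my @field = split /,/, $str;
    foreach my $key (@field) {
        push @{ $te{$key} }, grep { $_ ne $key } @field;
    }
}

# Print one "word => synonym, synonym" line per key.
foreach my $key (sort keys %te) {
    print "$key => ", join(", ", @{ $te{$key} }), "\n";
}
```

Each value is still an array reference, so a single list is available as @{ $te{chose} } and its element count as scalar @{ $te{chose} }.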
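Answer 1 (score: 2)
I think I see what you are trying to do: index the semantic links between words, and from those build lists of synonyms. Am I right? :-)
If a word appears in more than one synonym list, then for that word you create a hash entry with the word as the key and the words it originally appeared with as the values... or something like that. With a hash of arrays - as in @Lee Duhem's solution - you get a list (array) of synonyms for each key word. This is a common pattern. You do end up with a lot of hash entries, though.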
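One thing to watch with a plain hash of arrays is that the same pair of words can occur in several input lines, in which case push stores the synonym more than once. A possible variation, offered only as a sketch, is to collect the synonyms in a hash of hashes used as sets and convert to arrays at the end. The third input line and the %seen name are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @list = (
    "affaire,chose,question",
    "cause,chose,matière",
    "chose,affaire,truc",    # hypothetical line that repeats the affaire/chose pair
);

# $seen{$word}{$synonym} = 1: the inner hash acts as a set, so repeats collapse.
my %seen;
foreach my $str (@list) {
    my @field = split /,/, $str;
    foreach my $key (@field) {
        $seen{$key}{$_} = 1 for grep { $_ ne $key } @field;
    }
}

# Turn each set back into a sorted array reference.
my %te;
$te{$_} = [ sort keys %{ $seen{$_} } ] for keys %seen;

print "$_ => @{ $te{$_} }\n" for sort keys %te;
```

The trade-off is that the original order of the synonyms is lost, since hash keys are unordered; here they are simply sorted.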
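I have been playing around with a neat module by @miyagawa called Hash::MultiValue, which takes a different approach to accessing the list of values associated with each hash key: a multi-value hash. Some nice features are that you can create a hash of array references on the fly from the multi-value hash, "flatten" the hash, write callbacks with the ->each() method, and other neat things, so it is very flexible. I believe the module has no dependencies (other than for testing). Plus it is by @miyagawa (and other contributors), so using it and reading it will be good for you :-)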
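As a quick illustration of the features just mentioned, here is a small sketch. It assumes Hash::MultiValue is installed from CPAN (it is not a core module), and the two entries are just the sample words from the question:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Hash::MultiValue;    # CPAN module, not part of core Perl

my $h = Hash::MultiValue->new;
$h->add( affaire => 'chose', 'question' );    # add() accepts several values
$h->add( cause   => 'chose', 'matière' );

my @syn  = $h->get_all('affaire');    # every value stored under the key
my $last = $h->get('affaire');        # a single value: the last one added

# Plain hash of array references, built on the fly.
my $hoa = $h->as_hashref_multi;

# Flat list of key/value pairs, one pair per stored value.
my @flat = $h->flatten;

# The callback is called once per (key, value) pair.
$h->each( sub { print "$_[0] -> $_[1]\n" } );
```

Note that get() returns only one value per key, so get_all() or as_hashref_multi() is what you want when the full synonym list matters.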
I am no expert, and I am not sure it is appropriate for what you want, but as a variation on Lee's approach you might have something like:
#!/usr/bin/env perl
use strict;
use warnings;
use Hash::MultiValue;
my $words_hash = Hash::MultiValue->new();
# set up the mvalue hash
for my $words (<DATA>) {
my @synonyms = split (',' , $words) ;
$words_hash->add( shift @synonyms => (@synonyms[0..$#synonyms]) ) ;
};
for my $key (keys %{ $words_hash } ) {
print "$key --> ", join(", ", $words_hash->get_all($key)) ;
};
print "\n";
sub synonmize {
my $bonmot = shift;
my @bonmot_syns ;
# check key "$bonmot" for word to search and show values
push @bonmot_syns , $words_hash->get_all($bonmot);
# now grab values but leave out synonym's synonyms
foreach (keys %{ $words_hash } ) {
if ($_ !~ /$bonmot/ && grep {/$bonmot/} $words_hash->get_all($_)) {
push @bonmot_syns, grep {!/$bonmot/} $words_hash->get_all($_);
}
}
# show the keys with values containing target word
$words_hash->each(
sub { push @bonmot_syns, $_[0] if grep /$bonmot/ , @_[1..$#_] ; }
);
chomp @bonmot_syns ;
print "synonymes pour \"$bonmot\": @bonmot_syns \n" ;
}
# find synonyms
synonmize("chose");
synonmize("truc");
synonmize("matière");
__DATA__
affaire,chose,question
cause,chose,matière
chose,truc,bidule
fille,demoiselle,femme,dame
Output:
fille --> demoiselle, femme, dame
affaire --> chose, question
cause --> chose, matière
chose --> truc, bidule
synonymes pour "chose": truc bidule question matière affaire cause
synonymes pour "truc": bidule chose
synonymes pour "matière": chose cause
Tie::Hash::MultiValue is another option. Thanks to @Lee for the quick and clean solution :-)