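So let's say I have a string:
my $str = "Hello how are you today. Oh thats good I'm glad you are happy. Thats wonderful; thats fantastic.";
I want to create a hash in which each key is a unique word and the value is the number of times that word occurs in the string. In other words, I want this to be an automatic process.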
my %words = (
    "Hello" => 1,
    "are"   => 2,
    "thats" => 2,
    "Thats" => 1,
);
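Honestly, I'm new to Perl and have no idea how to do this, how to handle punctuation, and so on.
UPDATE:
Also, is it possible to use
split('.!?;',$mystring)
Not with this syntax, but basically to split on a . or a ! or a ? and so on. Oh, and on ' ' (whitespace) too.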
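Answer 0 (score: 4)
One simple way is to split the string on any character that, in your view, is not a valid word character. Note that this is by no means an exhaustive solution; I have only picked a limited set of characters.
You can add valid word characters inside the brackets [ ... ] as you discover edge cases. You could also search http://search.cpan.org for modules designed for this purpose.
The regex [^ ... ] means "match any character that is not inside the brackets". \pL matches any Unicode letter, a larger set than a-z, and the other characters are literals. The dash - must be escaped because it is a metacharacter inside a character class.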
use strict;
use warnings;
use Data::Dumper;
my $str = "Hello how are you today. Oh thats good I'm glad you are happy. Thats wonderful; thats fantastic.";
my %hash;
$hash{$_}++                        # increase count for each field
    for                            # in the loop
    split /[^\pL'\-!?]+/, $str;    # over the list from splitting the string
print Dumper \%hash;
Output:
$VAR1 = {
'wonderful' => 1,
'glad' => 1,
'I\'m' => 1,
'you' => 2,
'how' => 1,
'are' => 2,
'fantastic' => 1,
'good' => 1,
'today' => 1,
'Hello' => 1,
'happy' => 1,
'Oh' => 1,
'Thats' => 1,
'thats' => 2
};
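The advice above about adding characters inside the brackets can be sketched as follows. This is only an illustration under stated assumptions: count_words is a hypothetical helper name, the character class adds \pN (Unicode digits) and leaves out ! and ?, and the sample sentence is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: count words, treating letters, digits, apostrophes
# and dashes as word characters (an assumption; adjust the class to taste).
sub count_words {
    my ($text) = @_;
    my %count;
    # grep { length } drops the empty leading field that split returns
    # when the text starts with a separator.
    $count{$_}++ for grep { length } split /[^\pL\pN'\-]+/, $text;
    return \%count;
}

my $counts = count_words("  Room 101 is well-known; room 101, not 102!");
printf "%-12s => %d\n", $_, $counts->{$_} for sort keys %$counts;
```

Returning a hash reference keeps the helper reusable; the caller decides how to print or sort the counts.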
Answer 1 (score: 1)
This uses whitespace to separate the words.
#!/usr/bin/env perl
use strict;
use warnings;
my $str = "Hello how are you today."
. " Oh thats good I'm glad you are happy."
. " Thats wonderful. thats fantastic.";
# Use whitespace to split the string into single "words".
my @words = split /\s+/, $str;
# Store each word in the hash and count its occurrence.
my %hash;
for my $word ( @words ) {
    $hash{ $word }++;
}
# Show each word and its count. Using printf to align output.
for my $key ( sort keys %hash ) {
    printf "%-10s => %d\n", $key, $hash{ $key };
}
You will need to fine-tune this to get "real" words, as the trailing periods in the output show:
Hello      => 1
I'm        => 1
Oh         => 1
Thats      => 1
are        => 2
fantastic. => 1
glad       => 1
good       => 1
happy.     => 1
how        => 1
thats      => 2
today.     => 1
wonderful. => 1
you        => 2
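One possible way to do that fine-tuning is to trim punctuation from both ends of each whitespace-separated token before counting it. This is a minimal sketch, not part of the answer above: count_trimmed is a hypothetical helper name, and using the POSIX class [[:punct:]] is an assumption about which characters should be stripped.

```perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace, then trim punctuation
# from both ends of each token before counting it.
sub count_trimmed {
    my ($text) = @_;
    my %count;
    for my $token ( split ' ', $text ) {
        # Work on a copy; strip leading and trailing punctuation only,
        # so the apostrophe inside "I'm" is kept.
        ( my $word = $token ) =~ s/^[[:punct:]]+|[[:punct:]]+$//g;
        next unless length $word;    # skip tokens that were only punctuation
        $count{$word}++;
    }
    return \%count;
}

my $str = "Hello how are you today."
        . " Oh thats good I'm glad you are happy."
        . " Thats wonderful. thats fantastic.";
my $counts = count_trimmed($str);
printf "%-10s => %d\n", $_, $counts->{$_} for sort keys %$counts;
```

With this variant, "today." and "today" would be counted as the same word, while "Thats" and "thats" remain distinct because no case folding is applied.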
Answer 2 (score: 1)
Try this:
use strict;
use warnings;
my $str = "Hello, how are you today. Oh thats good I'm glad you are happy. Thats wonderful.";
my @strAry = split /[:,\.\s\/]+/, $str;
my %strHash;
foreach my $word (@strAry)
{
    print "\nFOUND WORD: ".$word;
    my $exstCnt = $strHash{$word};
    if (defined($exstCnt))
    {
        $exstCnt++;
    }
    else
    {
        $exstCnt = 1;
    }
    $strHash{$word} = $exstCnt;
}
print "\n\nNOW REPORTING UNIQUE WORDS:\n";
foreach my $unqWord (sort(keys(%strHash)))
{
    my $cnt = $strHash{$unqWord};
    print "\n".$unqWord." - ".$cnt." instances";
}
Answer 3 (score: 0)
use YAML qw(Dump);
use 5.010;
my $str = "Hello how are you today. Oh thats good I'm glad you are happy. Thats wonderful; thats fantastic.";
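# \w+ matches runs of letters, digits and underscores, so "I'm" is split into "I" and "m".
my @match_words = $str =~ /(\w+)/g;
my $word_hash = {};
# Sorting is not needed for counting; it only fixes the processing order.
foreach my $word (sort @match_words) {
    $word_hash->{$word}++;
}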
say Dump($word_hash);
# -------output----------
Hello: 1
I: 1
Oh: 1
Thats: 1
are: 2
fantastic: 1
glad: 1
good: 1
happy: 1
how: 1
m: 1
thats: 2
today: 1
wonderful: 1
you: 2
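The output above lists "I" and "m" as separate words because \w does not match the apostrophe. A possible variation, sketched here under assumptions, allows apostrophes between runs of word characters so that "I'm" stays whole. count_matches is a hypothetical helper name and the pattern is one option among several, not the answer's own method.

```perl
use strict;
use warnings;

# Hypothetical helper: count tokens made of word characters, allowing
# apostrophes between runs so that "I'm" stays a single token.
sub count_matches {
    my ($text) = @_;
    my %count;
    # The pattern has no capture groups, so in list context the /g match
    # returns each whole match.
    $count{$_}++ for $text =~ /\w+(?:'\w+)*/g;
    return \%count;
}

my $str = "Hello how are you today. Oh thats good I'm glad you are happy. Thats wonderful; thats fantastic.";
my $counts = count_matches($str);
printf "%s: %d\n", $_, $counts->{$_} for sort keys %$counts;
```

Matching the words directly, rather than splitting on separators, also avoids empty fields when the string begins with punctuation or whitespace.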