How can I find the array elements that are absent from the data in Perl?

Date: 2014-09-26 17:06:53

Tags: arrays regex perl

I am trying to use the elements of an array to find which words are absent from my data. My code is:

use warnings;
use strict;
my @ar = qw(one two three four five six seven eight nine ten);
my @data = <DATA>;
print "Absence word in the data\n";
foreach my $mat(@ar){
    my $nonmatch;
    foreach my $dat (@data){
        $nonmatch = grep{m/(?!$mat)/} $dat;
    }
    print "$nonmatch\n";
}
__DATA__
eight two four one two three four seven eight ten one two seven 

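Each element of @ar should be checked against the elements of the data array, and only the values that do not appear in the data should be printed.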

The output I expect is:

Absence word in the data
five
six
nine 

How can I do this?

6 answers:

Answer 0 (score: 2)

Use a %seen-style hash, as modeled in perlfaq4 - How can I tell whether a certain element is contained in a list or array?

use warnings;
use strict;

my %seen = map { $_ => 1 } map { split ' ' } <DATA>;

my @ar = qw(one two three four five six seven eight nine ten);

print "Absence word in the data\n";
print "$_\n" for grep { !$seen{$_} } @ar;

__DATA__
eight two four one two three four seven eight ten one two seven 

Output:

Absence word in the data
five
six
nine

Answer 1 (score: 1)

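You can use a hash slice, @seen{@r}, to store all the words from @r as keys of the %seen hash, and later check the elements of the @ar array against those hash keys: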

use warnings;
use strict;

my @ar = qw(one two three four five six seven eight nine ten);
my %seen;
while (my $mat = <DATA>) {
    my @r = split (' ', $mat);
    @seen{@r} = ();
}
print "Absence word in the data\n";
print "$_\n" for grep { not exists $seen{$_} } @ar;

__DATA__
eight two four one two three four seven eight ten one two seven 

Output:

Absence word in the data
five
six
nine

Answer 2 (score: 1)

This sounds like a problem I once ran into. The code I came up with is based on the information on this page:

https://www.safaribooksonline.com/library/view/perl-cookbook/1565922433/ch04s08.html

# assume @A and @B are already loaded

%seen  = ();                     # lookup table to test membership of B
@aonly = ();                     # answer

# build lookup table
$seen{$_} = 1 for @B;

# find elements only in @A and not in @B
for ( @A ) {
    push @aonly, $_ unless $seen{$_};
}
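
Applied to the data from the question, the snippet above might look like the following sketch. The names @wanted and @found are illustrative stand-ins for @A and @B and are not part of the Cookbook recipe. The words are inlined rather than read from __DATA__ so that the example is self-contained.

```perl
use strict;
use warnings;

# @wanted plays the role of @A, @found the role of @B (both names are illustrative).
my @wanted = qw(one two three four five six seven eight nine ten);
my @found  = split ' ', 'eight two four one two three four seven eight ten one two seven';

# Build the lookup table from the words that were found.
my %seen;
$seen{$_} = 1 for @found;

# Keep the elements of @wanted that never appear in @found.
my @aonly;
for my $word (@wanted) {
    push @aonly, $word unless $seen{$word};
}

print "$_\n" for @aonly;
```

Because @aonly is filled while walking @wanted, the missing words come out in the same order as in the original list.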

Answer 3 (score: 1)

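Create a hash with all the words from __DATA__ as keys (this can be done in one line using a hash slice), then select the words that are not in the hash (this can also be done in one line using grep).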
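
A minimal sketch of that approach, assuming the words arrive as a single whitespace-separated string. The names $text, %in_data and @absent are made up for this example, and $text stands in for what would be read from __DATA__.

```perl
use strict;
use warnings;

my @ar   = qw(one two three four five six seven eight nine ten);
# $text is a stand-in for the line read from __DATA__.
my $text = 'eight two four one two three four seven eight ten one two seven';

# Hash slice: every word in the data becomes a key (the values stay undef).
my %in_data;
@in_data{ split ' ', $text } = ();

# grep keeps the words of @ar that are not keys of %in_data.
my @absent = grep { !exists $in_data{$_} } @ar;

print "$_\n" for @absent;
```

Since the slice leaves the values undefined, the test has to use exists rather than the truth of the value.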

Answer 4 (score: 0)

This solution starts with the list of what you are looking for, deletes everything it sees along the way, and then prints what is left.

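It could be optimized for large data by checking, at the bottom of the while loop, whether any keys are still left in the %unseen hash, and stopping once none remain. I added another line to your test data containing the word "sixteen" to make sure that it works with multiple lines and that we do not get a false positive for "six".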

use warnings;
use strict;

my @to_match = qw/ one two three four five six seven eight nine ten /;
my %unseen;
$unseen{$_} = 1 for @to_match;
while (my $line = <DATA>) {
    foreach my $match_this (@to_match) {
        delete $unseen{$match_this} if $line =~/\b$match_this\b/;
    }
}
print "Words absent from the data:\n". join "\n", keys %unseen;
print "\n";
__DATA__
eight two four one two three four seven eight ten one two seven
sixteen
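
The optimization mentioned above could look roughly like this. It is a sketch rather than part of the original answer: the loop stops reading as soon as %unseen is empty, and the data is read from an in-memory filehandle ($fh, opened on the illustrative string $input) instead of __DATA__ to keep the example self-contained. \Q...\E is an addition so that a word containing regex metacharacters would be matched literally.

```perl
use strict;
use warnings;

my @to_match = qw/ one two three four five six seven eight nine ten /;
my %unseen   = map { $_ => 1 } @to_match;

# Illustrative input; in the answer above this comes from __DATA__.
my $input = "eight two four one two three four seven eight ten one two seven\nsixteen\n";
open my $fh, '<', \$input or die "Cannot open in-memory filehandle: $!";

while (my $line = <$fh>) {
    # Only test the words that have not been seen yet.
    for my $word (keys %unseen) {
        delete $unseen{$word} if $line =~ /\b\Q$word\E\b/;
    }
    # Nothing left to look for, so stop reading.
    last unless %unseen;
}
close $fh;

# Sort the keys so that the output order is stable.
my @absent = sort keys %unseen;
print "Words absent from the data:\n", join("\n", @absent), "\n";
```

Iterating over keys %unseen instead of @to_match also means each line is tested against fewer words as the hash shrinks.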

Answer 5 (score: 0)

Two things:

Always chomp what you read. That includes __DATA__:

my @data = <DATA>;   # The NL is in each element
chomp @data;         # Now it isn't!

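If you do not chomp, you end up checking whether one matches one\n. Also, since you put the whole of __DATA__ on a single line, it is read as a single line of input. You have to use split to break it up into an array.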
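
To illustrate both points, here is a small sketch that chomps a line and then splits it into words. The variable $line and its contents are made up for the example; they stand in for one line read from __DATA__.

```perl
use strict;
use warnings;

# A line as it would come back from <DATA>, including the trailing newline.
my $line = "eight two four one two three four seven eight ten one two seven \n";

chomp $line;                    # remove the trailing newline
my @words = split ' ', $line;   # split on runs of whitespace, ignoring leading whitespace

print scalar(@words), " words, the last one is '$words[-1]'\n";
```

With split ' ' the chomp is not strictly required, because the newline counts as whitespace, but it does no harm and keeps the habit consistent.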

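Second: in general, whenever you are asking an "is this in the list?" type of question, you should immediately think of a hash. A hash lets you look items up quickly. In this case, you can put your data into a hash and then check whether each item of your list is in that hash: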

#! /usr/bin/env perl
#

use strict;
use warnings;
use feature qw(say);

my @list = qw(one two three four five six seven eight nine ten);
my @data = <DATA>;
chomp @data;        # Don't forget!

#
# Translate your input as a hash
#

my %data_hash;
for my $element (@data) {
    $data_hash{$element} = 1;
}

for my $element (@list) {
    if ( not exists $data_hash{$element} ) {
        say "$element isn't in the list";
    }
}
__DATA__
eight
two
four
one
two
three
four
seven
eight
ten
one
two
seven

Note that the map command gives you a shorter way to write this loop:

#
# Translate your input as a hash
#

my %data_hash;
for my $element (@data) {
    $data_hash{$element} = 1;
}

It can now be shortened to a single line:

#
# Translate your input as a hash
#

my %data_hash =  map { $_ => 1 } @data;

This is a common way of turning an array into a hash, so most developers will simply use it.