Delete lines containing duplicate regular expression matches in Perl

Time: 2012-02-23 11:37:51

Tags: regex perl duplicates unique

I have an array containing the following elements:

@array = qw/ john jim rocky hosanna/;

INPUT FILE:

john wears blue shirt 

hosanna knows drawing

george and jim went to europe

john went to swimming

jim wears yellow shirt

rocky went to swimming

rocky learns painting

hosanna learns painting

Required output:

john wears blue shirt 

hosanna knows drawing

george and jim went to europe

rocky went to swimming

So I need only the lines in which each name occurs for the first time.

4 answers:

Answer 0 (score: 4):

@seen{@array} = ();
@out = grep { ($w) = split; !($seen{$w}++) } @in;
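As a minimal runnable sketch of the idiom above, assuming the input lines are already held in an array. The helper name `first_by_leading_word` is hypothetical and not part of the answer. Note that this approach keys on the first word of each line only, so with the question's input it would also keep "jim wears yellow shirt", because `jim` is not the first word of the `george` line.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative): keep only the first line
# seen for each leading word.
sub first_by_leading_word {
    my @lines = @_;
    my %seen;
    return grep {
        my ($w) = split;               # first whitespace-separated word of $_
        defined $w && !$seen{$w}++;    # true only the first time $w turns up
    } @lines;
}

# Input lines from the question.
my @in = (
    "john wears blue shirt",
    "hosanna knows drawing",
    "george and jim went to europe",
    "john went to swimming",
    "jim wears yellow shirt",
    "rocky went to swimming",
    "rocky learns painting",
    "hosanna learns painting",
);

print "$_\n" for first_by_leading_word(@in);
```

Lines that are empty or contain only whitespace have no first word and are dropped by the `defined` check.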

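Answer 1 (score: 1):

How about making a second array that records whether each name has already been used? The first time a line containing, say, jim is read, set the matching entry in this array to "used" and write the line to the output. If the name has already been used, do nothing.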

@array = qw(john jim rocky hosanna);
@used =(0,0,0,0);
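The idea described above could be sketched roughly as follows. This is one possible reading, not the answerer's code: the helper name `first_mentions` is hypothetical, names are assumed to match as whole words anywhere in the line, and a line is kept when it mentions at least one name that has not been used yet.

```perl
use strict;
use warnings;

my @array = qw/ john jim rocky hosanna /;

# Hypothetical helper: return the lines that mention at least one name
# from @$names that no earlier line has mentioned.
sub first_mentions {
    my ($names, @lines) = @_;
    my @used = (0) x @$names;    # one flag per name, parallel to @$names
    my @out;
    for my $line (@lines) {
        my $is_new = 0;
        for my $i (0 .. $#$names) {
            next if $used[$i];                                # already used
            next unless $line =~ /\b\Q$names->[$i]\E\b/;      # whole-word match
            $used[$i] = 1;
            $is_new   = 1;
        }
        push @out, $line if $is_new;
    }
    return @out;
}

# Input lines from the question.
my @in = (
    "john wears blue shirt",
    "hosanna knows drawing",
    "george and jim went to europe",
    "john went to swimming",
    "jim wears yellow shirt",
    "rocky went to swimming",
    "rocky learns painting",
    "hosanna learns painting",
);

print "$_\n" for first_mentions(\@array, @in);
```

Because the match is not tied to the first word, the `george` line marks `jim` as used, and the later "jim wears yellow shirt" line is skipped.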

Answer 2 (score: 1):

One way: I save the array data in a hash and delete each entry once it is found in the input file.

Contents of script.pl:

use warnings;
use strict;

## Input names to search.
my @array = qw/ john jim rocky hosanna/;

## Save names to a hash. This way they are easier to look up.
my %names = map { $_ => 1 } @array;

## Read file line by line.
while ( <> ) { 

    ## Avoid blank lines.
    next if m/\A\s*\Z/;

    ## Split line in fields.
    my @f = split;

    ## Count number of names in hash.
    my $num_entries = scalar keys %names;

    ## Remove words of hash found in line.
    for ( @f ) { 
        delete $names{ $_ };
    }   

    ## If there are now fewer names, the line contained at least one
    ## of them, so print the line.
    if ( scalar keys %names < $num_entries ) { 
        printf qq[%s\n], $_; 
    }   

    ## If the hash is empty, there are no lines left to print, so exit
    ## the loop without checking more lines.
    last if scalar keys %names == 0;
}

Command:

perl script.pl infile

Output:

john wears blue shirt 

hosanna knows drawing

george and jim went to europe

rocky went to swimming

Answer 3 (score: 1):

perl -ane 'print unless $a{$F[0]}++ ' inputfile

Hope this works.