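I have a file (say, bugs.txt) that is generated by running some code. The file contains a list of JIRA issue keys. I want to write code that removes the duplicate entries from this file.
The logic should be generic, because the contents of bugs.txt are different on every run.
Sample input file bugs.txt: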
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
Sample output:
BUG-111, BUG-122, BUG-123, JIRA-221, JIRA-234
My attempt so far:
my $file1="/path/to/file/bugs.txt";
my $Jira_nums;
open(FH, '<', $file1) or die $!;
{
local $/;
$Jira_nums = <FH>;
}
close FH;
I need help designing the logic to remove the duplicate entries from bugs.txt.
Answer 0 (score: 1)
You just need to add these lines to your script:
my %seen;
my @no_dups = grep { !$seen{$_}++ } split /,?\s+/, $Jira_nums; # \s+ so repeated spaces or newlines do not produce empty entries
The complete script then looks like this:
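use strict;
use warnings;
use feature 'say'; # say() is not available without this
use Data::Dumper;
my $file1="/path/to/file/bugs.txt";
my $Jira_nums;
open(my $FH, '<', $file1) or die $!; # use a lexical filehandle
{
local $/; # slurp mode: read the whole file at once
$Jira_nums = <$FH>;
}
close $FH;
my %seen;
# keep only the first occurrence of each entry
my @no_dups = grep { !$seen{$_}++ } split /,?\s+/, $Jira_nums;
say Dumper(\@no_dups);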
For input data like:
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221
it gives:
$VAR1 = [
          'BUG-111',
          'BUG-122',
          'BUG-123',
          'JIRA-221',
          'JIRA-234'
        ];
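Since the question asks to remove the duplicates from the file itself, the same idea can be taken one step further and written back to disk. Below is a minimal sketch, not a definitive implementation: the helper names dedupe_keys and dedupe_file are hypothetical, and it assumes that the keys are separated by commas and/or whitespace and that overwriting the file with a single comma-separated line is acceptable.

```perl
use strict;
use warnings;

# Hypothetical helper: split the text on commas and/or whitespace and keep
# only the first occurrence of each key, preserving the original order.
sub dedupe_keys {
    my ($text) = @_;
    my %seen;
    return grep { length($_) && !$seen{$_}++ } split /[,\s]+/, $text;
}

# Hypothetical helper: read the file, deduplicate its keys, and overwrite
# the file with one comma-separated line. Returns the deduplicated keys.
sub dedupe_file {
    my ($path) = @_;

    open(my $in, '<', $path) or die "Cannot read $path: $!";
    my $text = do { local $/; <$in> };   # slurp the whole file
    close $in;
    $text = '' unless defined $text;

    my @keys = dedupe_keys($text);

    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} join(', ', @keys), "\n";
    close $out or die "Cannot close $path: $!";

    return @keys;
}

# Usage (the path is a placeholder, as in the question):
# dedupe_file('/path/to/file/bugs.txt');
```

Because the file is reopened for writing only after it has been read completely, nothing is lost while reading. If a failed write must not destroy the original, writing to a temporary file and renaming it afterwards would be the safer variant.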
Answer 1 (score: 0)
You can try this:
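use strict;
use warnings;
my @bugs;
# Collect the keys from every input line, splitting on commas and/or whitespace
while (my $line = <DATA>) {
push @bugs, grep { length($_) } split /[,\s]+/, $line;
}
my @Sequenced = RemoveDup(@bugs);
print join(', ', @Sequenced), "\n";
# Return the arguments with duplicates removed, keeping first-seen order
sub RemoveDup { my %checked; return grep { !$checked{$_}++ } @_; }
__DATA__
BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221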
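As an alternative to a hand-written %seen hash, the core module List::Util provides uniq (added in List::Util 1.45), which also keeps the first occurrence of each value. A minimal sketch, assuming a new enough List::Util is installed and using the sample line from the question as hard-coded input:

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # uniq was added in List::Util 1.45

# Sample input taken from the question; in practice this would be read from the file.
my $line = "BUG-111, BUG-122, BUG-123, BUG-111, BUG-123, JIRA-221, JIRA-234, JIRA-221\n";

# Split on commas and/or whitespace, drop empty fields, then remove duplicates.
my @unique = uniq(grep { length($_) } split /[,\s]+/, $line);

print join(', ', @unique), "\n";
# prints: BUG-111, BUG-122, BUG-123, JIRA-221, JIRA-234
```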