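I want to perform the following operation on certain strings in a text file. I got help with the basic functionality I need, but now I also want to sort the lines and print each duplicated line only once.
I want to remove Z, ZN and LVT from the strings and sort them.
Input: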
abchsfk/jshflka/ZN (cellLVT)
abchsfk/jshflka/ZN (cellLVT)
asjkfsa/sfklfkshfsf/Z (mobLVT)
asjkfsa/sfklfkshfsf/Z (mobLVT)
asjhfdjkfd/sjfdskjfhdk/hsakfshf/Z (celLVT)
asjhfdjkfd/sjfdskjfhdk/hsakfshf/Z (celLVT)
asjhdjs/jhskjds/ZN (abcLVT)
asjhdjs/jhskjds/ZN (abcLVT)
shdsjk/jhskd/ZN (xyzLVT)
shdsjk/jhskd/ZN (xyzLVT)
Expected output:
abchsfk/jshflka cell
asjkfsa/sfklfkshfsf mob
asjhfdjkfd/sjfdskjfhdk/hsakfshf cel
asjhdjs/jhskjds abc
shdsjk/jhskd xyz
Code:
if ($line =~ /LVT/ && ($line =~ /ZN/ || $line =~ /Z/) )
#### matches the words LVT and ( Z or ZN)
{
$line =~ s/\/ZN?|\(|LVT\)//g;
my @line_out = $line;
$lvt_out = sort::$line_out();
print OUT " $lvt_out \n";
}
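Answer 0 (score: 4)
This is easily done using just map together with uniq from List::MoreUtils. I see from your comments that you don't actually want to sort the data, so I have left that out.
This program expects the path to the input file as a parameter on the command line.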
use strict;
use warnings;
use List::MoreUtils qw/ uniq /;
my @rows = uniq map { m| ([\w/]+)/ZN? \s+ \((\w+)LVT\) |x ? "$1\t$2" : () } <>;
printf "%-31s %s\n", split /\t/ for @rows;
Output:
abchsfk/jshflka                 cell
asjkfsa/sfklfkshfsf             mob
asjhfdjkfd/sjfdskjfhdk/hsakfshf cel
asjhdjs/jhskjds                 abc
shdsjk/jhskd                    xyz
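If installing List::MoreUtils is not an option, roughly the same thing can be sketched with uniq from the core List::Util module. This assumes List::Util 1.45 or later, which is when uniq was added there. The helper name extract_rows and the sample lines are only illustrative and are not part of the program above.

```perl
use strict;
use warnings;
use List::Util qw/ uniq /;    # uniq needs List::Util 1.45 or later

# Hypothetical helper: takes raw input lines and returns the unique
# "path<TAB>name" rows in first-seen order. Lines that do not match
# the pattern are silently skipped.
sub extract_rows {
    my @lines = @_;
    return uniq map {
        m| ([\w/]+)/ZN? \s+ \((\w+)LVT\) |x ? "$1\t$2" : ()
    } @lines;
}

my @sample = (
    "abchsfk/jshflka/ZN (cellLVT)\n",
    "abchsfk/jshflka/ZN (cellLVT)\n",
    "asjkfsa/sfklfkshfsf/Z (mobLVT)\n",
);

# Same formatting as above: left-justify the path in a 31-character column.
printf "%-31s %s\n", split /\t/ for extract_rows(@sample);
```

Called in list context, extract_rows returns two rows for the sample data, because the duplicated first line is collapsed by uniq.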
Answer 1 (score: 1)
You aren't actually sorting anything there. To get your output:
#!/usr/bin/env perl
use strict;
use warnings;
my %seen;
while ( my $line = <DATA> ) {
if ( $line =~ /LVT/ && ( $line =~ /ZN/ || $line =~ /Z/ ) )
#### matches the words LVT and ( Z or ZN)
{
$line =~ s/\/ZN?|\(|LVT\)//g;
print $line unless $seen{$line}++;
}
}
__DATA__
abchsfk/jshflka/ZN (cellLVT)
abchsfk/jshflka/ZN (cellLVT)
asjkfsa/sfklfkshfsf/Z (mobLVT)
asjkfsa/sfklfkshfsf/Z (mobLVT)
asjhfdjkfd/sjfdskjfhdk/hsakfshf/Z (celLVT)
asjhfdjkfd/sjfdskjfhdk/hsakfshf/Z (celLVT)
asjhdjs/jhskjds/ZN (abcLVT)
asjhdjs/jhskjds/ZN (abcLVT)
shdsjk/jhskd/ZN (xyzLVT)
shdsjk/jhskd/ZN (xyzLVT)
This gives:
abchsfk/jshflka cell
asjkfsa/sfklfkshfsf mob
asjhfdjkfd/sjfdskjfhdk/hsakfshf cel
asjhdjs/jhskjds abc
shdsjk/jhskd xyz
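The `unless $seen{$line}++` test is what drops the duplicates: post-increment returns the count from before the increment, so it is false only the first time a given line is seen. A minimal sketch of the same idiom in isolation, where dedupe_lines is a hypothetical helper name and the input is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: keeps the first occurrence of each line and
# preserves the original order.
sub dedupe_lines {
    my %seen;
    # $seen{$_}++ yields 0 (false) on the first sighting, so grep keeps
    # the line; on later sightings it yields a positive count, so the
    # line is dropped.
    return grep { !$seen{$_}++ } @_;
}

# Prints "b", "a" and "c", each on its own line.
print dedupe_lines( "b\n", "a\n", "b\n", "a\n", "c\n" );
```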
If you are serious about sorting them, what criterion are you using? A simple alphanumeric sort:
print sort keys %seen;
gives:
abchsfk/jshflka cell
asjhdjs/jhskjds abc
asjhfdjkfd/sjfdskjfhdk/hsakfshf cel
asjkfsa/sfklfkshfsf mob
shdsjk/jhskd xyz
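If the criterion turns out to be something other than the whole line, the sort block can be adapted. As one possibility, and purely as an assumption since the question does not say, the lines could be ordered by the name in the second column. The helper sort_by_name below is hypothetical and expects already-cleaned "path name" lines:

```perl
use strict;
use warnings;

# Hypothetical helper: sorts cleaned "path name" lines by the second
# field, using the whole line to break ties.
sub sort_by_name {
    # Pair each line with its second whitespace-separated field.
    my @decorated = map {
        my ( undef, $name ) = split ' ', $_;
        [ defined $name ? $name : '', $_ ];
    } @_;

    # Compare on the extracted name first, then on the full line.
    return map { $_->[1] }
           sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] } @decorated;
}

print sort_by_name(
    "abchsfk/jshflka cell\n",
    "asjkfsa/sfklfkshfsf mob\n",
    "asjhdjs/jhskjds abc\n",
);
```

With the loop above, this could be called as `print sort_by_name( keys %seen );` in place of the plain sort.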