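Perl: load lines into an array based on consecutive field values
I am trying to write a script that walks through each line of a file and looks at the second field. If the second field matches that of the following line, the whole line should be pushed onto an array. Once the second field differs from the previous one, stop pushing lines onto the array, then print the array. In the set below, most lines whose second field starts with PUSA would be skipped, but the lines whose second field contains WMCE would be pushed onto the array.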
15:15:07.705 "PUSA17122100vx1m" STE
15:15:08.709 "PUSA17122100w9sn" STE
15:50:25.244 "PUSA171221014uk8" STE
15:50:26.509 "PUSA171221014vpo" STE
15:50:26.750 "PUSA171221015j7w" STE
13:58:34.518 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" STE
16:05:31.310 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" STE
16:05:31.310 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" STE
16:05:34.938 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" STE
16:03:35.805 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" EOM
16:03:36.420 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" EOM
15:47:40.061 "PUSA171221015gtm" STE
15:47:41.460 "PUSA171221015mmi" STE
15:47:45.635 "PUSA17122101536p" STE
10:35:50.524 "PUSA171221007k8z" STE
10:40:11.406 "PUSA171221007vwl" STE
13:51:04.820 "PUSS171221000jpu" STE
14:42:50.589 "PUSS17122100193k" STE
09:49:53.111 "PUSA171221002a7g" STE
13:58:34.562 "WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221" STE
16:05:31.302 "WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221" STE
16:05:31.302 "WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221" STE
16:05:34.931 "WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221" STE
16:03:36.396 "WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221" EOM
16:03:35.859 "WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221" EOM
15:15:06.747 "PUSA17122100w7fw" STE
15:15:08.348 "PUSA17122100vrv8" STE
15:15:08.542 "PUSA17122100vzhu" STE
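Here is what I have so far. I tried saving the second field value (@$row[1]) and then matching it against the value on the next line, but the arrays I get contain two lines.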
#!/usr/bin/perl
use Text::CSV ;
use Time::Local ;
use strict ;
use warnings ;
my $file = $ARGV[0] ;
open my $fh, "<", $file or die "$file: $!" ;
my $csv = Text::CSV->new ({
    binary    => 1,
    auto_diag => 1,
});
while (my $row = $csv->getline ($fh)) {
    print "@$row\n" ;
}
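Answer 0 (score: 0)
If I understand correctly, you want to print lines whenever two or more consecutive lines have a matching second field.
My solution does not require you to look ahead to the next line. Instead, it stores lines in a temporary array, where the last stored line is easy to compare against. On a mismatch or at the end of input, the temporary array is flushed to the output if it contains more than one (matching) item.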
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV ;
use Time::Local ;
my $file = $ARGV[0] ;
open my $fh, "<", $file or die "$file: $!" ;
my $csv = Text::CSV->new({
    binary    => 1,
    auto_diag => 1,
});
my @temp_row_storage = ();
while (my $row = $csv->getline($fh)) {
    if (@temp_row_storage and                      # if we have stored rows; and
        $temp_row_storage[-1][1] ne $row->[1]) {   # the last stored row differs
                                                   # from the current row
        # => print the stored rows if there are at least 2 matching ones
        print(map { "@$_\n" } @temp_row_storage) if @temp_row_storage >= 2;
        @temp_row_storage = ();                    # empty - these were already printed
    }
    # always store the current row
    push(@temp_row_storage, $row);
}
# cleanup: print the last batch of rows if there were at least 2 matching ones
print(map { "@$_\n" } @temp_row_storage) if @temp_row_storage >= 2;
Input CSV file:
15:15:07.705,"PUSA17122100vx1m",STE
15:15:08.709,"PUSA17122100w9sn",STE
15:50:25.244,"PUSA171221014uk8",STE
15:50:26.509,"PUSA171221014vpo",STE
15:50:26.750,"PUSA171221015j7w",STE
13:58:34.518,"WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221",STE
16:05:31.310,"WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221",STE
16:05:31.310,"WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221",STE
16:05:34.938,"WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221",STE
16:03:35.805,"WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221",EOM
16:03:36.420,"WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221",EOM
15:47:40.061,"PUSA171221015gtm",STE
15:47:41.460,"PUSA171221015mmi",STE
15:47:45.635,"PUSA17122101536p",STE
10:35:50.524,"PUSA171221007k8z",STE
10:40:11.406,"PUSA171221007vwl",STE
13:51:04.820,"PUSS171221000jpu",STE
14:42:50.589,"PUSS17122100193k",STE
09:49:53.111,"PUSA171221002a7g",STE
13:58:34.562,"WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221",STE
16:05:31.302,"WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221",STE
16:05:31.302,"WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221",STE
16:05:34.931,"WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221",STE
16:03:36.396,"WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221",EOM
16:03:35.859,"WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221",EOM
15:15:06.747,"PUSA17122100w7fw",STE
15:15:08.348,"PUSA17122100vrv8",STE
15:15:08.542,"PUSA17122100vzhu",STE
Output:
13:58:34.518 WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221 STE
16:05:31.310 WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221 STE
16:05:31.310 WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221 STE
16:05:34.938 WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221 STE
16:03:35.805 WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221 EOM
16:03:36.420 WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221 EOM
13:58:34.562 WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221 STE
16:05:31.302 WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221 STE
16:05:31.302 WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221 STE
16:05:34.931 WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221 STE
16:03:36.396 WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221 EOM
16:03:35.859 WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221 EOM
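Note that the sample in the question is separated by whitespace rather than commas. The same buffering idea can be sketched for that layout without Text::CSV. This is only a rough sketch under the assumption that no field contains embedded spaces; `consecutive_runs` and `@sample` are hypothetical names introduced for illustration, not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every run of two or more consecutive lines
# whose second whitespace-separated field is identical.
sub consecutive_runs {
    my @lines = @_;
    my (@runs, @current, $current_key);
    for my $line (@lines) {
        my $key = (split ' ', $line)[1];      # second field, quotes included
        next unless defined $key;             # ignore blank or one-field lines
        if (@current and $current_key ne $key) {
            # the run ended: keep it only if it had at least 2 lines
            push @runs, [@current] if @current >= 2;
            @current = ();
        }
        push @current, $line;
        $current_key = $key;
    }
    push @runs, [@current] if @current >= 2;  # flush the final run
    return @runs;
}

# A few lines taken from the question's sample data
my @sample = (
    '15:50:26.750 "PUSA171221015j7w" STE',
    '13:58:34.518 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" STE',
    '16:05:31.310 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" STE',
    '15:47:40.061 "PUSA171221015gtm" STE',
);

print "$_\n" for map { @$_ } consecutive_runs(@sample);
```

With these four sample lines only the two WMCE lines are printed, quotes intact, because the lines are never parsed beyond the split. To process a real file, the lines could be read with `chomp(my @lines = <$fh>)` and passed to the helper.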
Edit:
If you want to keep/restore the quotes around the second field, you can use
print(map { qq!$_->[0] "$_->[1]" $_->[2]\n! } @temp_row_storage)
if @temp_row_storage >= 2;
or
print(map { sprintf(qq!%s "%s" %s\n!, @$_) } @temp_row_storage)
if @temp_row_storage >= 2;
In both cases, however, you must know that every line contains exactly three fields for the implementation to work reliably.
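If the number of fields is not known in advance, one possible workaround is to quote only the second field and join whatever fields are present. This is a minimal sketch, not a tested drop-in; `format_row` is a hypothetical helper name, and it assumes each row is an array reference such as the ones returned by `getline`.

```perl
use strict;
use warnings;

# Hypothetical helper: join a parsed row with spaces, putting double quotes
# back around the second field only when that field exists.
sub format_row {
    my ($row) = @_;
    my @fields = @$row;                        # copy, so the caller's row is untouched
    $fields[1] = qq{"$fields[1]"} if @fields > 1;
    return join(' ', @fields) . "\n";
}

print format_row([ '13:58:34.518', 'WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221', 'STE' ]);
print format_row([ '15:15:07.705' ]);          # a short row is printed without quotes
```

In the script above this would replace the `map` body, for example `print(map { format_row($_) } @temp_row_storage) if @temp_row_storage >= 2;`.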
Answer 1 (score: -1)
Try this. If it is not what you want, please clarify your question.
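use strict ;
use warnings ;
# open the file named on the command line; stop with a message if that fails
open (my $in, "<", $ARGV[0]) or die "$ARGV[0]: $!";
my $prev_row = "";    # second field of the previous line
my @rec;
while (my $row = <$in>) {
    @rec = split(" ", $row);
    if ($prev_row eq "") {
        # first line: always print
        print $row ;
    } else {
        if ($rec[1] eq $prev_row) {
            #skip
        } else {
            # second field changed: print the line
            print $row;
        }
    }
    $prev_row = $rec[1];
}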
Output:
15:15:07.705 "PUSA17122100vx1m" STE
15:15:08.709 "PUSA17122100w9sn" STE
15:50:25.244 "PUSA171221014uk8" STE
15:50:26.509 "PUSA171221014vpo" STE
15:50:26.750 "PUSA171221015j7w" STE
13:58:34.518 "WMCEQ42PRD_NX:EQ-58661535-751d143b2002:171221" STE
15:47:40.061 "PUSA171221015gtm" STE
15:47:41.460 "PUSA171221015mmi" STE
15:47:45.635 "PUSA17122101536p" STE
10:35:50.524 "PUSA171221007k8z" STE
10:40:11.406 "PUSA171221007vwl" STE
13:51:04.820 "PUSS171221000jpu" STE
14:42:50.589 "PUSS17122100193k" STE
09:49:53.111 "PUSA171221002a7g" STE
13:58:34.562 "WMCEQ42PRD_NX:EQ-58661583-a62e3e5ad011:171221" STE
15:15:06.747 "PUSA17122100w7fw" STE
15:15:08.348 "PUSA17122100vrv8" STE
15:15:08.542 "PUSA17122100vzhu" STE