Question

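I'm new to Perl and am currently using it for some text processing. The input file has four tab-separated columns. For rows that share the same ID (judging by the desired output, the pair of values in columns 1 and 2), I want to find the minimum of column 3 and the maximum of column 4 and put them on a single line. The input file looks like this: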

A   A1  1  5
A   A1  9  18
A   A1  23 40
A   A2  20 30
A   A2  35 43
B   A1  2  10
B   A1  12 30
B   A1  35 100
C   A9  2  40
C   A9  45 70

The output I want:

A   A1 1  40
A   A2 20 43
B   A1 2  100
C   A9 2  70

Answer 1

Perl from the command line:

perl -anE'
  $k = join "\t", @F[0,1];
  $h{$k} or push @r, $k;
  (!defined or $_ >$F[2]) and $_ = $F[2] for $h{$k}{m};
  (!defined or $_ <$F[3]) and $_ = $F[3] for $h{$k}{M};
}{
  say join "\t", $_, @{$h{$_}}{qw(m M)} for @r
' file

Output:

A       A1      1       40
A       A2      20      43
B       A1      2       100
C       A9      2       70

Answer 2

Something like this?

use strict;
use warnings;

open my $fh, '<', 'input-data.txt'
  or die "Cannot open input-data.txt: $!";

# Keep track of the current minimum and maximum
# values while we read the file.
#
my (%val1_min, %val2_max);

while (<$fh>)   ## loop through lines of file
{
  chomp;        ## remove trailing "\n" character

  # Split on sequences of whitespace
  #
  my ($key1, $key2, $val1, $val2) = split /\s+/;

  # Record a new minimum if there is no old
  # minimum, or if the old minimum is higher
  # than the current value.
  #
  $val1_min{$key1}{$key2} = $val1
    if !defined($val1_min{$key1}{$key2})
    or $val1_min{$key1}{$key2} > $val1;

  # Record a new maximum if there is no old
  # maximum, or if the old maximum is lower
  # than the current value.
  #
  $val2_max{$key1}{$key2} = $val2
    if !defined($val2_max{$key1}{$key2})
    or $val2_max{$key1}{$key2} < $val2;
}

# Now we need to produce some output.
#    

# Loop through the first level of keys.
#
for my $key1 (sort keys %val1_min)
{
  # Loop through the second level of keys.
  #
  for my $key2 (sort keys %{$val1_min{$key1}})
  {
    # Print a line of output to STDOUT.
    #
    printf(
      "%-04s %-04s %3d %3d\n",   ## formatting string
      $key1,                     ## first key
      $key2,                     ## second key
      $val1_min{$key1}{$key2},   ## minimum first value
      $val2_max{$key1}{$key2},   ## maximum second value
    );
  }
}

Answer 3

Using Perl from the command line:

perl -MList::Util=max,min -lane '
    $k = join "\t", splice @F, 0, 2;
    push @k, $k if !$v{$k};
    push @{$v{$k}[$_]}, $F[$_] for (0..$#F);
  }{
    print join "\t", $_, min(@{$v{$_}[0]}), max(@{$v{$_}[1]}) for @k;
  ' file.txt

Output:

A       A1      1       40
A       A2      20      43
B       A1      2       100
C       A9      2       70

Answer 4

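Read the data file line by line, use the combination of the first two columns as the key of a record hash, and record the minimum of column 3 and the maximum of column 4 in that hash. If you want to preserve the order of the keys, you can also push them onto an array.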

#!/usr/bin/perl

use strict;
use warnings;

my (%record, @key);

while (<>) {
    chomp;
    my @field = split /\s+/;
    my $key = join "\t", @field[0,1];
    push @key, $key unless $record{$key};
    # Test with defined() so that a stored value of 0 is not mistaken for "unset".
    if (!defined($record{$key}{min}) || $record{$key}{min} > $field[2]) {
        $record{$key}{min} = $field[2];
    }
    if (!defined($record{$key}{max}) || $record{$key}{max} < $field[3]) {
        $record{$key}{max} = $field[3];
    }
}

for my $key (@key) {
    # Keep "\n" outside the join so no trailing tab is printed before it.
    print join("\t", $key, $record{$key}{min}, $record{$key}{max}), "\n";
}
