I have an input file, file1.txt, that looks like this:
file1.txt:
test 1 Vertical 564N,
test 2 Vertical 551N,
test 3 Hydrophone 127N, 223D, 344D,
test 4 Hydrophone 350D,
test 6 Hydrophone 407D,
How can I extract only the matching values from the 4th column onward and classify them as follows?
Output:
N   D
564 223
551 344
127 350
    407
Answer 0: (score: 2)
In GNU awk:
$ awk '
BEGIN {
    FS=",? *"                   # fields separated by an optional comma and spaces
    OFS="\t"                    # output fields separated by tabs
}
{
    for(i=4;i<=NF;i++)          # process fields 4 and above
        if($i~/N$/)             # store n and d values without the suffix letter
            n[++j]=substr($i,1,length($i)-1)
        else if($i~/D$/)
            d[++k]=substr($i,1,length($i)-1)
}
END {
    n[j=0]="N"                  # headers
    d[k=0]="D"
    while(n[j]!=""||d[k]!="")   # output them until they run out
        print n[j++],d[k++]
}' file1.txt
Output:
N       D
564     223
551     344
127     350
        407
Answer 1: (score: 1)
Here is also a Perl version, whose output is formatted the same way as yours:
#!/usr/bin/perl
use warnings;
use strict;
use List::MoreUtils qw/zip/;

my %nums;

while (<>) {
    my @f = split /\s+/;
    for my $nd (@f[3..$#f]) {
        if ($nd =~ /^(\d+)([ND])/) {
            push @{$nums{$2}}, $1;
        }
    }
}

print "N   D\n";
my @pairs = zip @{$nums{N}}, @{$nums{D}};
while (@pairs) {
    my ($n, $d) = (shift @pairs, shift @pairs);
    printf "%-3s %-3s\n", $n//"", $d//"";
}
Edit: after some golfing and playing around:
#!/usr/bin/perl
use warnings;
use strict;
use List::MoreUtils qw/zip6/;

my %nums = (N => [ "N" ], D => [ "D" ]);

while (<>) {
    my @f = split /\s+/;
    for my $nd (@f[3..$#f]) {
        push @{$nums{$2}}, $1 if $nd =~ m/^(\d+)([ND])/;
    }
}

printf "%-3s %-3s\n", $_->[0]//"", $_->[1]//"" for zip6 @{$nums{N}}, @{$nums{D}};
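Answer 2: (score: 0)
Here's a Perl solution.
It reads the input file and discards the first three columns from each line. What remains is searched for fields that look like 999N or 999D, and these are split into a hash of arrays with N and D as keys.
After the arrays are sorted, they are displayed as columns beneath the appropriate headers.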
use strict;
use warnings 'all';
use List::Util 'max';

open my $fh, '<', 'file1.txt' or die $!;

my %data;

while ( <$fh> ) {
    my @fields = split ' ', $_, 4;
    push @{ $data{$2} }, $1 while $fields[3] =~ /(\d+)([ND])/g;
}

$_ = [ sort { $a <=> $b } @$_ ] for values %data;
my $last = max map $#$_, values %data;

my $fmt = "%-3s %-3s\n";
printf $fmt, qw/ N D /;
for my $i ( 0 .. $last ) {
    printf $fmt, map { $_->[$i] // '' } @data{qw/ N D /};
}
Output:
N   D
127 223
551 344
564 350
    407
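For comparison, the same steps (drop the first three columns, pull out the 999N / 999D fields, sort each list, print in columns) can be sketched in Python. This is only a rough equivalent, not a line-by-line translation of the Perl; the function name sorted_columns and the inline sample list are made up for illustration.

```python
import re
from collections import defaultdict
from itertools import zip_longest

# Digits followed by a single N or D, e.g. 564N or 223D.
VALUE = re.compile(r"(\d+)([ND])")


def sorted_columns(lines):
    """Return table rows (header first) with each column sorted ascending."""
    data = defaultdict(list)
    for line in lines:
        # Split off the first three columns; the rest stays as one string.
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # nothing after the third column
        for digits, kind in VALUE.findall(parts[3]):
            data[kind].append(int(digits))
    rows = [("N", "D")]
    # Pad the shorter column with empty strings.
    rows.extend(zip_longest(sorted(data["N"]), sorted(data["D"]), fillvalue=""))
    return rows


# Sample input copied from the question (an assumption: one record per line).
sample = [
    "test 1 Vertical 564N,",
    "test 2 Vertical 551N,",
    "test 3 Hydrophone 127N, 223D, 344D,",
    "test 4 Hydrophone 350D,",
    "test 6 Hydrophone 407D,",
]

for n, d in sorted_columns(sample):
    print("%-3s %-3s" % (n, d))
```

As in the Perl version, the N column comes out in ascending order, which differs from the order requested in the question.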
Answer 3: (score: -1)
This should help.
from itertools import zip_longest
# Python 2:
# from itertools import izip_longest as zip_longest

filename = "file1.txt"
N = []
D = []
with open(filename) as infile:
    for line in infile:             # iterate over each line
        val = line.split()[3:]      # keep the values from column 4 onward
        for i in val:
            if i.endswith("N,"):
                N.append(int(i.rstrip("N,")))
            elif i.endswith("D,"):
                D.append(int(i.rstrip("D,")))

# Sort (N descending, D ascending), then pair up, padding the shorter list.
for i in zip_longest(sorted(N, reverse=True), sorted(D), fillvalue=""):
    print(i)
Output:
(564, 223)
(551, 344)
(127, 350)
('', 407)
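Note that sorting N in descending order and D in ascending order only happens to reproduce the requested output for this particular sample. A minimal sketch that keeps the values in file order instead, assuming every value is a run of digits followed by a single N or D; columns_in_file_order and format_table are hypothetical helper names, not part of the answer above.

```python
from itertools import zip_longest


def columns_in_file_order(lines):
    """Collect N and D values in the order they appear, without sorting."""
    columns = {"N": [], "D": []}
    for line in lines:
        for token in line.split()[3:]:
            value = token.rstrip(",")
            # The last character selects the column; other tokens are skipped.
            if value[-1:] in columns and value[:-1].isdigit():
                columns[value[-1]].append(value[:-1])
    # Pair the columns row by row, padding the shorter one with "".
    return list(zip_longest(columns["N"], columns["D"], fillvalue=""))


def format_table(rows, header=("N", "D")):
    """Join the header and rows into tab-separated lines."""
    return "\n".join("\t".join(row) for row in [header, *rows])


# Sample input copied from the question.
sample = [
    "test 1 Vertical 564N,",
    "test 2 Vertical 551N,",
    "test 3 Hydrophone 127N, 223D, 344D,",
    "test 4 Hydrophone 350D,",
    "test 6 Hydrophone 407D,",
]

print(format_table(columns_in_file_order(sample)))
```

The values are kept as strings here because they are only printed; convert them with int() if they need to be compared or summed.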