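I want to convert col1 and col2 into a matrix: if a col1 value such as A occurs together with a given seq, a * should be printed in that cell.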
Input file:
A seq1
A seq3
B seq1
B seq2
B seq3
C seq1
C seq2
D seq1
Required output:
     A B C D
seq1 * * * *
seq2   * *
seq3 * *
Answer 0 (score: 3)
Perl from the command line:
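perl -lane '
  # $F[0] is the column label, $F[1] is the row label
  $h{$F[1]}{$F[0]} = "*";
  # record each column label once, in first-seen order
  $s{$F[0]}++ or push @r, $F[0];
  END {
    $, = "\t";                                     # tab-separated output
    print "", @r;                                  # header row
    print $_, @{ $h{$_} }{@r} for sort keys %h;    # one line per row label
  }
' file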
Output:
     A B C D
seq1 * * * *
seq2   * *
seq3 * *
Answer 1 (score: 1)
Here is an awk solution:
awk '{a[$1FS$2]="*";b[$1];c[$2]} END {printf "\t";for (j in b) printf "%s\t",j;print "";for (i in c) {printf "%s\t",i;for (j in b) printf "%s\t",a[j FS i];print ""}}' file
     A B C D
seq1 * * * *
seq2   * *
seq3 * *
More readable:
awk '
{
    data[$1 FS $2] = "*"
    col[$1]
    row[$2]
}
END {
    printf "\t"
    for (j in col)
        printf "%s\t", j
    print ""
    for (i in row) {
        printf "%s\t", i
        for (j in col)
            printf "%s\t", data[j FS i]
        print ""
    }
}' file
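One caveat with the awk version: `for (j in col)` visits array keys in an unspecified order, so the column and row order can differ between awk implementations. A minimal sketch that sorts both sets of labels explicitly, assuming a POSIX awk; the shell function name `pivot` and the awk helper `isort` are hypothetical names introduced here for illustration:

```shell
# pivot: read "column row" pairs and print a presence matrix
# with columns and rows in sorted order (hypothetical helper name).
pivot() {
  awk '
    # Insertion sort on arr[1..n]; labels are compared as strings.
    function isort(arr, n,    i, j, tmp) {
      for (i = 2; i <= n; i++) {
        tmp = arr[i]
        for (j = i - 1; j >= 1 && (arr[j] "") > (tmp ""); j--)
          arr[j + 1] = arr[j]
        arr[j + 1] = tmp
      }
    }
    {
      # Record each label once so it can be sorted later.
      if (!($1 in cseen)) { cseen[$1]; col[++nc] = $1 }
      if (!($2 in rseen)) { rseen[$2]; row[++nr] = $2 }
      data[$1 FS $2] = "*"
    }
    END {
      isort(col, nc)
      isort(row, nr)
      # Header line: empty corner cell, then the column labels.
      for (j = 1; j <= nc; j++) printf "\t%s", col[j]
      print ""
      # One line per row label; missing cells print as empty strings.
      for (i = 1; i <= nr; i++) {
        printf "%s", row[i]
        for (j = 1; j <= nc; j++) printf "\t%s", data[col[j] FS row[i]]
        print ""
      }
    }
  ' "$@"
}
```

It can be called as `pivot file`, or fed from a pipe when no file name is given. The sort is quadratic, which is probably fine for a handful of labels; for large inputs, sorting outside awk may be the better choice.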
Answer 2 (score: 0)
You can also use this Perl script:
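use strict;
use warnings;

my ( %hash, %hash1 );

# "File_name" is the input file; three-argument open with an error check
open my $fh, '<', 'File_name' or die "Cannot open File_name: $!";
while (<$fh>) {
    if (/(\w+)\s+(\w+)/) {
        $hash1{$1} = 0;               # column labels (A, B, ...)
        push @{ $hash{$2} }, $1;      # row label => list of column labels
    }
}
close $fh;

# header row
print "\t$_" foreach ( sort keys %hash1 );

foreach my $val ( sort keys %hash ) {
    # mark the columns present in this row
    $hash1{$_}++ foreach ( @{ $hash{$val} } );
    print "\n$val\t";
    foreach ( sort keys %hash1 ) {
        print $hash1{$_} ? "*\t" : "\t";
    }
    # reset the marks for the next row
    $hash1{$_} = 0 foreach ( keys %hash1 );
}
print "\n";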