Convert columns to a matrix based on matches

Time: 2014-09-17 09:49:04

Tags: perl awk

I want to convert col1 and col2 into a matrix: if a letter such as A occurs with a given seq, a * should be printed in that cell.

Input file:
A   seq1
A   seq3
B   seq1
B   seq2
B   seq3
C   seq1
C   seq2
D   seq1

Required output:
        A   B   C   D
seq1    *   *   *   *
seq2        *   *
seq3    *   *   

3 answers:

Answer 0 (score: 3)

Perl from the command line:

perl -lane'
  $h{$F[1]}{$F[0]} = "*";
  $s{$F[0]}++ or push @r, $F[0]; 
END{ 
  $, = "\t";
  print "", @r;
  print $_, @{ $h{$_} }{@r} for sort keys %h;
}
' file

Output

        A       B       C       D
seq1    *       *       *       *
seq2            *       *
seq3    *       *

Answer 1 (score: 1)

Here is how to do it in awk:

awk '{a[$1FS$2]="*";b[$1];c[$2]} END {printf "\t";for (j in b) printf "%s\t",j;print "";for (i in c) {printf "%s\t",i;for (j in b) printf "%s\t",a[j FS i];print ""}}' file
        A       B       C       D
seq1    *       *       *       *
seq2            *       *
seq3    *       *

More readable:

awk '
    {data[$1FS$2]="*"
    col[$1]
    row[$2]} 
END {printf "\t"
    for (j in col) 
        printf "%s\t",j
    print ""
    for (i in row) {
        printf "%s\t",i
        for (j in col) 
            printf "%s\t",data[j FS i]
        print ""}
    }' file
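
In awk, `for (j in col)` visits keys in an unspecified order, so the column and row order produced by the command above can differ between awk implementations. The following is a minimal sketch of one way to make the order deterministic, assuming a POSIX awk and that sorting the labels as strings is acceptable. The names `cols_to_matrix` and `sort_labels` are hypothetical helpers introduced here, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: prints the matrix for the two-column file given as $1.
cols_to_matrix() {
    awk '
    # Insertion sort of a[1..n], comparing the labels as strings.
    function sort_labels(a, n,    i, j, v) {
        for (i = 2; i <= n; i++) {
            v = a[i]
            for (j = i - 1; j >= 1 && (a[j] "") > (v ""); j--)
                a[j + 1] = a[j]
            a[j + 1] = v
        }
    }
    {
        data[$1, $2] = "*"
        # Remember each distinct column and row label once.
        if (!($1 in seen_col)) { seen_col[$1]; col[++nc] = $1 }
        if (!($2 in seen_row)) { seen_row[$2]; row[++nr] = $2 }
    }
    END {
        sort_labels(col, nc)
        sort_labels(row, nr)
        # Header line: empty first cell, then the column labels.
        for (j = 1; j <= nc; j++) printf "\t%s", col[j]
        print ""
        # One line per row label; cells without a match stay empty.
        for (i = 1; i <= nr; i++) {
            printf "%s", row[i]
            for (j = 1; j <= nc; j++) printf "\t%s", data[col[j], row[i]]
            print ""
        }
    }' "$1"
}
```

Called as `cols_to_matrix file`, this prints the columns and rows in sorted order regardless of how the awk implementation iterates over arrays. Indexed arrays with explicit counters are used instead of `for (... in ...)` because that is the only loop order POSIX awk guarantees.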

Answer 2 (score: 0)

You can also use this Perl script:

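use strict;
use warnings;

my ( %hash, %hash1 );

# "File_name" is a placeholder for the path of the input file.
open my $fh, '<', "File_name" or die "Cannot open File_name: $!";
while (<$fh>) {
    if (/(\w+)\s+(\w+)/) {
        $hash1{$1} = 0;             # column label (A, B, ...)
        push @{ $hash{$2} }, $1;    # row label (seq1, ...) => its columns
    }
}
close $fh;

# Header row
print "\t$_" foreach ( sort keys %hash1 );

foreach my $val ( sort keys %hash ) {
    # Mark the columns that occur with this row label
    $hash1{$_}++ foreach ( @{ $hash{$val} } );
    print "\n$val\t";
    foreach ( sort keys %hash1 ) {
        print( $hash1{$_} ? "*\t" : "\t" );
    }
    # Reset the marks before the next row
    $hash1{$_} = 0 foreach ( keys %hash1 );
}
print "\n";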