How do I determine the bitwise binary flags of a list of numbers in Perl?

Asked: 2013-06-17 16:58:46

Tags: perl binary

I'm trying to convert some numbers between decimal and binary. I'm working with data produced in the following format:

Example decimal: 163,   Corresponding binary: 10100011

Binary table key:

[image not preserved]

...and the corresponding description for the binary number in question:

[image not preserved]

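I'd like to be able to take a decimal number, convert it to binary, and then use this lookup table to print the list of attributes for the given decimal. I can convert my decimal to binary with the following code: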

sub dec2bin {
    my $str = unpack("B32", pack("N", shift));
    $str =~ s/^0+(?=\d)//;   # otherwise you'll get leading zeros
    return $str;
}

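But then I can't see how to use the lookup table. The problem is that I have binary numbers designed specifically to be compatible with this table, e.g. 1000011, 10000011, 101110011, but I just can't see how to use these binary values to get at their descriptions. They even have different lengths!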

Can someone help me understand what is going on here?

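Edit: Here is another lookup table I found... maybe it is more accurate/helpful? It looks identical to me, but it comes from the software's official website.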

[image not preserved]

3 answers:

Answer 0: (score: 2)

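The table is written as 16-bit values, so just convert them to base 2 (I copied/pasted the table from another forum; please fix it if it differs from the screenshot):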

0000000001 the read is paired in sequencing
0000000010 the read is mapped in a proper pair
0000000100 the query sequence itself is unmapped
0000001000 the mate is unmapped
0000010000 strand of the query (1 for reverse)
0000100000 strand of the mate
0001000000 the read is the first read in a pair
0010000000 the read is the second read in a pair

etc...
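The different lengths mentioned in the question are only a matter of leading zeros: bits are read from the right, so a shorter binary string lines up with the table once it is padded on the left. A minimal sketch, assuming a display width of 10 to match the rows above; `pad_bits` is a hypothetical helper name, not part of the original code:

```perl
use strict;
use warnings;

# Render a flag value as a fixed-width binary string so that it lines
# up with the table rows. The default width of 10 matches the rows
# shown above; pad_bits is a hypothetical helper name.
sub pad_bits {
    my ($number, $width) = @_;
    $width = 10 unless defined $width;
    return sprintf("%0${width}b", $number);
}

# oct() understands a "0b" prefix, so binary strings of any length
# convert back to an ordinary number.
for my $bits (qw(1000011 10000011 101110011)) {
    my $number = oct("0b$bits");
    printf "%-9s = %3d = %s\n", $bits, $number, pad_bits($number);
}
```

The three example strings from the question come out as 67, 131 and 371, each padded to the same width.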

To get the relevant descriptions for a number, use the following code:

use strict;
use warnings;

my @descriptions = (
    "the read is paired in sequencing",
    "the read is mapped in a proper pair",
    # ...
);
check_number(163); # Note that you don't need to convert to binary :)

sub check_number {
    my $number  = shift;
    my $bitmask = 1; # doubles on every pass, so it selects the next bit up
    for my $i (0 .. $#descriptions) {
        my $match = ($bitmask & $number) ? 1 : 0; # is the bit switched on?
        print "|$match| $descriptions[$i] | \n";
        $bitmask *= 2; # or bit-shift ($bitmask <<= 1) - faster but less readable.
    }
}

The output from my test code is (sorry, I got lazy copying/pasting the description strings, so I faked them):

$ perl5.8 17152880.pl
|1| the read is paired in sequencing |
|1| the read is mapped in a proper pair |
|0| 3 |
|0| 4 |
|0| 5 |
|1| 6 |
|0| 7 |
|1| 8 |
|0| 9 |

If you only want to print the descriptions that match, change the print statement in the loop to print "$descriptions[$i]\n" if $match;

The advantage of this approach is that it extends easily to a longer description table.
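The matching-only variant can be sketched as a small function that returns the descriptions instead of printing a row per bit. This is only an illustration: `set_flags` and `@flag_descriptions` are hypothetical names, the strings are copied from the eight table rows listed in this answer, and any later rows are left out.

```perl
use strict;
use warnings;

# Descriptions in bit order (lowest bit first), copied from the eight
# table rows listed in this answer; later rows are omitted.
my @flag_descriptions = (
    "the read is paired in sequencing",
    "the read is mapped in a proper pair",
    "the query sequence itself is unmapped",
    "the mate is unmapped",
    "strand of the query (1 for reverse)",
    "strand of the mate",
    "the read is the first read in a pair",
    "the read is the second read in a pair",
);

# set_flags is a hypothetical helper: it returns the descriptions
# whose bit is set in $number. (1 << $_) is the mask for bit $_.
sub set_flags {
    my ($number) = @_;
    return map  { $flag_descriptions[$_] }
           grep { $number & (1 << $_) } 0 .. $#flag_descriptions;
}

print "$_\n" for set_flags(163);
```

For 163 this prints four lines: the two pairing descriptions, "strand of the mate", and "the read is the second read in a pair".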

Answer 1: (score: 1)

A simpler approach might be to just check each key in the map and compare it directly against the converted number.

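sub get_descriptions {
   # "0 +" forces a plain number: pass 163, not the string "10100011"
   my $flags = 0 + shift;
   my @descriptions;

   # %description_map is assumed to map each bit value (1, 2, 4, ...)
   # to its description; sorting keeps the output in bit order
   for my $k (sort { $a <=> $b } keys %description_map) {
      # bitwise comparison; hash keys are strings, and & works
      # character by character when both operands are strings,
      # so force the key to a number as well
      if( (0 + $k) & $flags ) {
         # add description because this bit is set
         push @descriptions, $description_map{$k};
      }
   }

   # full listing of all descriptions for the set bits
   return @descriptions;
}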
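The answer does not show `%description_map`, so the following is only a sketch of how it might be filled in and used. The assumption here is that keys are the decimal bit values and the descriptions are those from the table in the first answer. The inputs are two of the binary strings from the question, converted with `oct` first.

```perl
use strict;
use warnings;

# Hypothetical contents for %description_map: keys are bit values,
# descriptions are taken from the table in the first answer.
my %description_map = (
    1   => "the read is paired in sequencing",
    2   => "the read is mapped in a proper pair",
    4   => "the query sequence itself is unmapped",
    8   => "the mate is unmapped",
    16  => "strand of the query (1 for reverse)",
    32  => "strand of the mate",
    64  => "the read is the first read in a pair",
    128 => "the read is the second read in a pair",
);

# Same idea as the answer's sub, written compactly; both operands of &
# are forced to numbers so Perl never falls back to string-wise AND.
sub get_descriptions {
    my $flags = 0 + shift;
    return map  { $description_map{$_} }
           grep { (0 + $_) & $flags }
           sort { $a <=> $b } keys %description_map;
}

for my $bits (qw(1000011 10000011)) {
    print "$bits:\n";
    print "  $_\n" for get_descriptions(oct("0b$bits"));
}
```

A hash keyed by bit value does not need every bit to be present, which is handy if only some flags are of interest.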

Answer 2: (score: 1)

Once a number has been converted, the base it was written in on input no longer matters. Internally, just think of it as a number.

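The value 163 represents a bit field, that is, each of its bits is the answer to some yes-or-no question, and the table tells you how the questions are arranged.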
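To make the bit field concrete, 163 can be broken into the powers of two that add up to it, each of which corresponds to one "yes" answer. A minimal sketch; `set_bit_values` is a hypothetical helper name used only for illustration:

```perl
use strict;
use warnings;

# set_bit_values is a hypothetical helper: it lists the powers of two
# that are set in $number, lowest bit first.
sub set_bit_values {
    my ($number) = @_;
    my @values;
    for (my $bit = 1; $bit <= $number; $bit <<= 1) {
        push @values, $bit if $number & $bit;
    }
    return @values;
}

print join(" + ", set_bit_values(163)), " = 163\n";   # 1 + 2 + 32 + 128 = 163
```

Each value in that sum (1, 2, 32, 128) is one row of the lookup table.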

You can use subs to give the bits human-readable names, such as

sub read_is_paired { $_[0] & 0x0001 }
sub read_is_mapped { $_[0] & 0x0002 }
sub strand_of_mate { $_[0] & 0x0020 }
sub read_is_2nd    { $_[0] & 0x0080 }

Then decoding the bit field looks like

my $flags = 163;
print "read is paired?  ", read_is_paired($flags) ? "YES" : "NO", "\n",
      "read is mapped?  ", read_is_mapped($flags) ? "YES" : "NO", "\n",
      "strand of mate = ", strand_of_mate($flags) ? "1"   : "0",  "\n",
      "read is second?  ", read_is_2nd($flags)    ? "YES" : "NO", "\n";

Output:

read is paired?  YES
read is mapped?  YES
strand of mate = 1
read is second?  YES