The CHECK argument of the encode function

Date: 2011-02-22 14:55:36

Tags: perl encode

Why do I get different output from the second loop (the one with the CHECK argument set)?

#!/usr/bin/env perl
use warnings;
use 5.012;
use Encode qw(encode);
my $s = 'a';

for my $encoding ( 'iso-8859-1', 'iso-8859-15', 'cp1252', 'cp850' ) {
    my $encoded = encode( $encoding, $s );
    my $c = unpack '(B8)*', $encoded;
    printf "%-12s:\t%8s\n", $encoding, $c;
}

say "-------------------";

for my $encoding ( 'iso-8859-1', 'iso-8859-15', 'cp1252', 'cp850' ) {
    my $encoded = encode( $encoding, $s, Encode::FB_WARN );
    my $c = unpack '(B8)*', $encoded;
    printf "%-12s:\t%8s\n", $encoding, $c;
}


# iso-8859-1  :   01100001
# iso-8859-15 :   01100001
# cp1252      :   01100001
# cp850       :   01100001
# -------------------
# iso-8859-1  :   01100001
# Use of uninitialized value $c in printf at ./perl1.pl line 20.
# iso-8859-15 :           
# Use of uninitialized value $c in printf at ./perl1.pl line 20.
# cp1252      :           
# Use of uninitialized value $c in printf at ./perl1.pl line 20.
# cp850       :   

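3 answers:

Answer 0: (score: 4)

The documentation describes this behavior (see the excerpt below): with a CHECK value such as Encode::FB_WARN, encode modifies the data argument and leaves only the unprocessed part in $s. Since there is no error, the whole string is consumed and your variable ends up empty. Every later iteration of the second loop therefore encodes an empty string, and unpack has nothing to return, hence the "uninitialized value" warnings.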

*CHECK* = Encode::FB_QUIET
  If *CHECK* is set to Encode::FB_QUIET, (en|de)code will immediately
  return the portion of the data that has been processed so far when an
  error occurs. The data argument will be overwritten with everything
  after that point (that is, the unprocessed part of data). This is
  handy when you have to call decode repeatedly in the case where your
  source data may contain partial multi-byte character sequences, (i.e.
  you are reading with a fixed-width buffer). Here is a sample code that
  does exactly this:

    my $buffer = ''; my $string = '';
    while(read $fh, $buffer, 256, length($buffer)){
      $string .= decode($encoding, $buffer, Encode::FB_QUIET);
      # $buffer now contains the unprocessed partial character
    }

*CHECK* = Encode::FB_WARN
  This is the same as above, except that it warns on error. Handy when
  you are debugging the mode above.
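
To make the excerpt concrete, here is a minimal sketch of the partial-processing case. The sample string is my own choice, not from the question: it assumes iso-8859-1 as the target, which has no euro sign, so encoding stops at that character.

```perl
use strict;
use warnings;
use Encode qw(encode);

# "a" and "b" exist in iso-8859-1; the euro sign (U+20AC) does not.
my $s = "a\x{20AC}b";

# FB_QUIET: stop at the first unencodable character, return what was
# processed so far, and overwrite $s with the unprocessed remainder.
my $encoded = encode( 'iso-8859-1', $s, Encode::FB_QUIET );

printf "encoded part: %s\n", $encoded;
printf "characters left in \$s: %d\n", length $s;
```

When the input contains no unencodable character, as in the question, nothing is left over and $s ends up empty.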

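Answer 1: (score: 1)

When CHECK is set to Encode::FB_QUIET (or Encode::FB_WARN, which differs only in that it warns), the data argument is overwritten: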

perl -MEncode -Mutf8 -E '$s="a"; encode("utf-8", $s, Encode::FB_WARN); say $s'

Answer 2: (score: 1)

You can prevent the overwriting by ORing in Encode::LEAVE_SRC:
my $encoded = encode( $encoding, $s, Encode::FB_WARN | Encode::LEAVE_SRC);
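
A small side-by-side sketch of the difference, using the same one-character string as the question. The variable names $plain and $kept are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Without LEAVE_SRC the source variable is overwritten with the
# unprocessed remainder, which is empty when nothing goes wrong.
my $plain = 'a';
encode( 'iso-8859-1', $plain, Encode::FB_WARN );

# With LEAVE_SRC ORed in, the source variable is left untouched.
my $kept    = 'a';
my $encoded = encode( 'iso-8859-1', $kept, Encode::FB_WARN | Encode::LEAVE_SRC );

printf "without LEAVE_SRC: length %d\n", length $plain;
printf "with LEAVE_SRC:    length %d\n", length $kept;
```

With this flag in the second loop of the question, $s should survive each iteration, so all four encodings get the same input.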