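How can I correctly align UTF-8 strings with Perl's printf?

Asked: 2010-01-14 14:04:05

Tags: perl unicode encoding

What is the right way to get nicely formatted output (all lines with the same indentation)?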

#!/usr/bin/env perl
use warnings;
use strict;
use DBI;

my $phone_book = [ [ qw( name number ) ],
            [ 'Kroner', 123456789 ],
            [ 'Holler', 123456789 ],
            [ 'Mühßig', 123456789 ],
            [ 'Singer', 123456789 ],
            [ 'Maurer', 123456789 ],
];

my $dbh = DBI->connect( "DBI:CSV:", undef, undef, { RaiseError => 1 } );
$dbh->do( qq{ CREATE TEMP TABLE phone_book AS IMPORT( ? ) }, {}, $phone_book );

my $sth = $dbh->prepare( qq{ SELECT name, number FROM phone_book } );
$sth->execute;

my $array_ref = $sth->fetchall_arrayref();

for my $row ( @$array_ref ) {
    printf "%9s %10s\n", @$row;
}

# OUTPUT:

#   Kroner  123456789
#   Holler  123456789
# Mühßig  123456789
#   Singer  123456789
#   Maurer  123456789

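3 Answers:

Answer 0 (score: 4)

I can't reproduce it, but loosely speaking, what seems to be happening is a character encoding mismatch. Most likely your Perl source file is saved as UTF-8, but you have not enabled use utf8; in the script. Perl therefore treats each non-ASCII German character as two characters (its two UTF-8 bytes) and computes the padding accordingly. Your terminal is also in UTF-8 mode, so the characters still display correctly. Try adding use warnings;, and I'll bet you get a warning; I would not be surprised if adding use utf8; actually fixes the problem.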
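The effect described above can be reproduced without DBI. The following is a minimal sketch, assuming the name reaches printf as undecoded UTF-8 bytes; the variable names and the escaped byte string are illustrative and do not come from the original script.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 bytes of 'Mühßig', written as escapes so the example does not
# depend on how this file itself is saved.
my $bytes = "M\xC3\xBCh\xC3\x9Fig";

# Decoding turns the 8 bytes into 6 characters.
my $chars = decode( 'UTF-8', $bytes );

# sprintf pads according to length(), so the undecoded string gets
# two fewer spaces of padding than the decoded one.
my $padded_bytes = sprintf( "%9s", $bytes );
my $padded_chars = sprintf( "%9s", $chars );

binmode( STDOUT, ':encoding(UTF-8)' );
printf "length: %d bytes, %d characters\n", length($bytes), length($chars);
print "decoded: [$padded_chars]\n";
```

With the decoded string, the name receives the same three spaces of padding as the six-letter ASCII names in the question.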

Answer 1 (score: 2)

You cannot rely on printf to align Unicode text if any of your code points occupy 0 or 2 print columns instead of 1.

Use Unicode::GCString instead.

The wrong way:

printf "%-10.10s", our $string;

The right way:

use Unicode::GCString;

my $gcstring = Unicode::GCString->new(our $string);
my $colwidth = $gcstring->columns();
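if ($colwidth > 10) {
    # Too wide: keep the first 10 grapheme clusters (this can still exceed
    # 10 columns if some of them are double-width).
    print $gcstring->substr(0, 10);
} else {
    # Pad on the right to match the left-justified "%-10.10s" above.
    print $gcstring;
    print " " x (10 - $colwidth);
}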
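Applied to the right-aligned layout from the question, the same idea might look like the sketch below. The helper name pad_left_columns is hypothetical, and the example assumes the CPAN module Unicode::GCString is installed and that the strings have already been decoded into characters.

```perl
use strict;
use warnings;
use utf8;                 # the string literals below are UTF-8
use Unicode::GCString;    # CPAN module, not part of core Perl

# Hypothetical helper: right-align $text in a field of $width print columns.
# Text that is already wider than the field is returned unchanged.
sub pad_left_columns {
    my ( $text, $width ) = @_;
    my $cols = Unicode::GCString->new($text)->columns();
    my $fill = $width > $cols ? $width - $cols : 0;
    return ( ' ' x $fill ) . $text;
}

binmode( STDOUT, ':encoding(UTF-8)' );
for my $name ( 'Kroner', 'Mühßig', '日本語' ) {
    print pad_left_columns( $name, 9 ), " 123456789\n";
}
```

The padding is computed from print columns rather than from length(), so double-width East Asian characters are counted as two columns each.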

Answer 2 (score: 2)

    #!/usr/bin/env perl

    use warnings;
    use strict;

    use utf8; # This is to allow utf8 in this program file (as opposed to reading/writing from/to file handles)

    binmode( STDOUT, ':encoding(UTF-8)' ); # Encode output to STDOUT as UTF-8

    my @strings = ( 'Mühßig', 'Holler' ); # UTF8 in this file, works because of 'use utf8'

    foreach my $s (@strings) { printf( "%-15s %10s\n", $s, 'lined up' ); } # should line up nicely

    open( my $fh, '<', 'utf8file' ) or die("Failed to open file: $!");

    binmode( $fh, ':encoding(UTF-8)' );

    # Same as above, but on the input file handle instead of STDOUT

    while (<$fh>) { chomp; printf( "%-15s %10s\n", $_, 'lined up' ); }

    close($fh);

    # This works too
    use Encode;

    open( my $raw_fh, '<', 'utf8file' ) or die("Failed to open file: $!");

    while (<$raw_fh>) {
        chomp;
        $_ = decode_utf8($_);    # turn the raw bytes into characters
        printf( "%-15s %10s\n", $_, 'lined up' );
    }

    close($raw_fh);
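To try the file-handle approach without creating utf8file, an in-memory file handle can stand in for it. This is only a sketch: the byte string below is an assumed sample of what such a file might contain, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Stand-in for the contents of 'utf8file': two lines of UTF-8 encoded bytes.
my $file_bytes = "M\xC3\xBCh\xC3\x9Fig\nHoller\n";

# Read through a decoding layer so each line arrives as characters.
open( my $in, '<:encoding(UTF-8)', \$file_bytes )
    or die "Cannot open in-memory file: $!";

my @lines;
while ( my $line = <$in> ) {
    chomp $line;
    push @lines, sprintf( "%-15s %10s", $line, 'lined up' );
}
close($in);

# Encode on the way out so the terminal receives UTF-8 again.
binmode( STDOUT, ':encoding(UTF-8)' );
print "$_\n" for @lines;
```

Because both lines are decoded before sprintf sees them, the second column starts at the same character offset in each line.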