Question

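I'm new to Perl. This is my second assignment: I have to write a program that parses n files and prints m sentences using an n-gram model. Long story short, I wrote this script, which takes n arguments: the first and second are numbers and the rest are file names. However, I get this error: Wide character in print at ngram.pl line 35, <$fh> line 1.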

Steps to reproduce:

Enter on the command line: perl ngram.pl 5 10 tale-cities.txt bleak-house.txt papers.txt
Output: Wide character in print at ngram.pl line 35, <$fh> line 1.

#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use Scalar::Util qw(looks_like_number);
use utf8;
use Encode;
#Charles Dickens


sub checkIfNumberic
{
    my ($inp)=@_;
    if  (looks_like_number($inp)){
       return "True";
    }
    else{
        return "False" ;
    }
}
sub main
{
    my $correctInput=", your input must be something like this 5 10 somefile.txt somefile2.txt ";
    my @inputs= @ARGV;
    if (checkIfNumberic($inputs[0]) eq "False"){
        die "first argument must be numberic $correctInput\n";
    }
    if (checkIfNumberic($inputs[1]) eq "False"){
        die "second argument must be numberic $correctInput\n";
    }
    for (my $i=2;  $i< scalar @inputs ;$i++)
    {
        if (open(my $fh, '<:encoding(UTF-8)', $inputs[$i])) {
            while (my $line = <$fh>) {
                chomp $line;
                print "$line \n";
            }
        }
    }
}

main();

Answer 1

You decode your inputs (use utf8; for the script's source code, :encoding(UTF-8) for the files), but you never encode your output. Add

use open ':std', ':encoding(UTF-8)';

This is equivalent to

BEGIN {
   binmode STDIN,  ':encoding(UTF-8)';
   binmode STDOUT, ':encoding(UTF-8)';
   binmode STDERR, ':encoding(UTF-8)';
}

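It also sets the default encoding for file handles opened within its lexical scope, so you can remove the existing :encoding(UTF-8) from the open call if you want.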
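To see what the missing output layer changes, here is a minimal sketch that prints the same decoded string to two in-memory handles, one without an encoding layer and one with :encoding(UTF-8). The helper bytes_written and the sample string are made up for illustration and are not part of the script in the question. U+2019 (right single quotation mark) is only a guess at the kind of character found in the input files.

```perl
use strict;
use warnings FATAL => 'all';   # same setting as the script in the question

# Hypothetical helper for illustration: print $text to an in-memory handle
# opened with the given mode ('>' or '>:encoding(UTF-8)') and return the
# raw bytes that ended up in the buffer.
sub bytes_written {
    my ($mode, $text) = @_;
    my $buffer = '';
    open(my $out, $mode, \$buffer) or die "cannot open in-memory handle: $!";
    print {$out} $text;
    close($out) or die "cannot close in-memory handle: $!";
    return $buffer;
}

# A decoded (character) string containing U+2019, a code point above 255.
my $text = "it\x{2019}s";

# Without an encoding layer, print warns "Wide character in print";
# FATAL => 'all' turns that warning into an exception, caught here by eval.
my $ok    = eval { bytes_written('>', $text); 1 };
my $error = $ok ? '' : $@;

# With the layer, the characters are encoded to UTF-8 bytes on output.
my $encoded = bytes_written('>:encoding(UTF-8)', $text);

print "without layer: $error" if $error;
printf "characters: %d, encoded bytes: %d\n", length($text), length($encoded);
```

The second handle behaves the way STDOUT does once use open ':std', ':encoding(UTF-8)'; is in effect: the 4 characters go out as 6 UTF-8 bytes and no warning is raised.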
