Question

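I am doing a large (8000 lines) library data conversion. I read in a file and want to modify it line by line. Reading the file stops before the end of the file is reached.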

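use strict;
use warnings;
use feature 'say';   # needed for say() below

open(my $infh, "<", 'infile.pica')
        || die("Could not open pica file for reading: $!");
open(my $outfh, ">", 'infile.pica.norm')
        || die("Could not open output file for writing: $!");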

my $counter = 0;

while (my $line = <$infh>) {

    $counter++;
    # for debugging - this is the last line being read. 
    # infile actually has 7857 lines

    if ($counter >= 7691) {
        say $line;
    }
    # modification commented out for debugging
    print $outfh $line;
}   
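close $infh;
# closing flushes any buffered output, so check that it succeeded
close($outfh) || die("Could not close output file: $!");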

My first thought was that there is a strange character on that line, but there is nothing:

006X $cEBC$03564211 (original)
006X $cEBC$035642 (being read; that's what say prints)

Here is the fragment of the data set where it stops reading:

002@ $0Oax
002C $aText$btxt
002D $aComputermedien$bc
002E $aOnline-Ressource$bcr
004A $09780309160193
006X $cEBC$03564211
010@ $aeng
011@ $a2010

In a hex dump you can see that every line is followed by a newline character (hex code 0A). The 006X line is where it stops reading.
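For anyone who wants to repeat the check for hidden characters, here is a minimal sketch in core Perl. The helper names `hex_dump` and `has_odd_bytes` are made up for this example, and the sample line is the one from the data above; it assumes the data is plain ASCII with LF line endings.

```perl
use strict;
use warnings;

# Hypothetical helper: return the bytes of a string as space-separated hex codes.
sub hex_dump {
    my ($bytes) = @_;
    return uc join ' ', unpack('(H2)*', $bytes);
}

# Hypothetical helper: true if the string contains anything other than
# printable ASCII (0x20-0x7E) or a line feed (0x0A).
sub has_odd_bytes {
    my ($bytes) = @_;
    return $bytes =~ /[^\x20-\x7E\x0A]/ ? 1 : 0;
}

my $line = "006X \$cEBC\$03564211\n";   # the line where reading appears to stop
print hex_dump($line), "\n";
print has_odd_bytes($line)
    ? "suspicious bytes found\n"
    : "only printable ASCII and LF\n";
```

If this reports nothing suspicious for the affected line, the input data is probably not the culprit.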

Answer 1

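Here is the solution: the problem had nothing to do with the script I posted, or even with reading the file. In the script I did not post, I forgot to flush the output by calling: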

PICA::Writer->end()

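See the documentation of PICA::Writer->end:

Finishes writing. Depending on the format and output handler, a footer is written (for instance an XML end tag) and the output handler is closed. Afterwards the state is set to PICA::Writer::ENDED. If the writer had not been started before, the start method is called first.

Ending or writing to an already ended writer will throw an error. You can restart an ended writer with the output method or the start method.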
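The symptom (output cut off in the middle of a line, such as 035642 instead of 03564211) is typical of buffered output that was never flushed. The sketch below reproduces the general mechanism with a plain Perl filehandle; it is not PICA::Writer code, and the temporary file is only there for the demonstration. It assumes the usual block buffering that Perl applies to regular files.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file used only for this demonstration.
my ($tmpfh, $path) = tempfile(UNLINK => 1);
close $tmpfh;

open(my $out, '>', $path) or die "Could not open $path: $!";
print {$out} "006X \$cEBC\$03564211\n";

# The line normally still sits in Perl's output buffer at this point,
# so the file on disk is usually still empty.
my $size_before = (stat $path)[7];

# close() flushes the buffer; $out->flush would do the same without closing.
close($out) or die "Could not close $path: $!";
my $size_after = (stat $path)[7];

print "before close: $size_before bytes, after close: $size_after bytes\n";
```

Anything that inspects the output file before the handle is closed or flushed sees only what has already left the buffer, which is why the last lines looked as if they were missing.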
