Question

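Suppose I have a directory containing many text files (raw text). What I need is a Perl script that parses the text files in that directory one by one (top to bottom) and saves their contents in a single new file that I specify. In other words, I just want to build a corpus out of many documents. Note: the files must be separated by some tag, for example one indicating the order in which they were parsed.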

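So far I have managed to follow some examples, and I know how to read, write, and parse a text file. But I have not been able to merge these into one script that handles many text files. Can you help? Thanks.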

Edit: sample code for writing to a file.

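#!/usr/local/bin/perl
use strict;
use warnings;

# Open data.txt for appending ('>>') and add one line to it.
open(my $out, '>>', 'data.txt') or die "Cannot open data.txt: $!";
print {$out} "text\n";
close($out);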

Sample code for reading a file.

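#!/usr/local/bin/perl
use strict;
use warnings;

# Read data.txt line by line and echo each line.
open(my $in, '<', 'data.txt') or die "Cannot open data.txt: $!";
while (my $line = <$in>) {
    chomp $line;
    print "$line\n";
}
close($in);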

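I also came across the foreach loop, which could be used for the task itself, but I still don't know how to combine these pieces to get the result described above.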

Answer 1

The main points of this suggestion are:

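the "magic" diamond operator <> (a.k.a. readline), which reads from each file named on the command line through the *ARGV filehandle,
the eof function, which tells you whether the next readline on the current filehandle would return any data,
the $ARGV variable, which contains the name of the currently open file.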

With that introduction, here we go!

#!/usr/bin/perl

use strict; # Always!
use warnings; # Always!

my $header = 1; # Flag to tell us to print the header
while (<>) { # read a line from a file
    if ($header) {
        # This is the first line, print the name of the file
        print "========= $ARGV ========\n";
        # reset the flag to a false value
        $header = undef;
    }
    # Print out what we just read in
    print;
}
continue { # This happens before the next iteration of the loop
    # Check if we finished the previous file
    $header = 1 if eof;
}

To use it, just run: perl concat.pl *.txt > compiled.TXT
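If you would rather have the script find the files and write the output itself, with a tag that records the order in which each file was read, something like the following might work. This is only a sketch under assumptions: the sub name build_corpus, the script name corpus.pl, and the <doc id="..." name="..."> tag format are made up for illustration, and it only picks up files ending in .txt, in sorted name order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: concatenate every .txt file in $dir into $out_path.
# Each document is wrapped in a made-up <doc> tag carrying its sequence
# number and file name. Returns the number of files written.
sub build_corpus {
    my ($dir, $out_path) = @_;

    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
    # Keep plain files ending in .txt; sort so the order is reproducible.
    my @files = sort(grep { /\.txt\z/ && -f "$dir/$_" } readdir($dh));
    closedir($dh);

    open(my $out, '>', $out_path) or die "Cannot write '$out_path': $!";
    my $count = 0;
    for my $name (@files) {
        open(my $in, '<', "$dir/$name") or die "Cannot read '$dir/$name': $!";
        $count++;
        print {$out} qq{<doc id="$count" name="$name">\n};
        my $last = "\n";
        while (my $line = <$in>) {
            print {$out} $line;
            $last = $line;
        }
        close($in);
        # Make sure the closing tag starts on its own line.
        print {$out} "\n" unless $last =~ /\n\z/;
        print {$out} "</doc>\n";
    }
    close($out) or die "Cannot close '$out_path': $!";
    return $count;
}

# When called as: perl corpus.pl <directory> <output file>
if (@ARGV == 2) {
    my $n = build_corpus(@ARGV);
    print STDERR "Wrote $n file(s) to $ARGV[1]\n";
}
elsif (@ARGV) {
    die "Usage: $0 <directory> <output file>\n";
}
```

For example: perl corpus.pl ./texts corpus.out. It is probably safest to give the output file a name that does not end in .txt, or to keep it outside the input directory, so that a later run does not read the previous corpus back in.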
