Question

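Edit: Sorry for the confusion. I have edited the question so that it hopefully asks what I actually want.

I would like to know whether there is a way to open and join two or more files so that the rest of my program can run on them.

For example, my directory contains these files:

taggedchpt1_1.txt, parsedchpt1_1.txt, taggedchpt1_2.txt, parsedchpt1_2.txt, etc.

The program has to read a tagged file and a parsed file at the same time. I want to run it on both chpt1_1 and chpt1_2, preferably joined together into a single .txt file, unless that would be slow. For example, it would run on two files with the contents of:

taggedchpt1_1_and_chpt1_2 and parsedchpt1_1_and_chpt1_2

Can this be done in Perl? Or should I just combine the text files myself (or automate that process, producing chpt1.txt, which includes chpt1_1, chpt1_2, chpt1_3, etc.)?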

#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
print "Please type in the chapter and section NUMBERS in the form chp#_sec#:\n"; ##So the user inputs 31_3, for example
chomp (my $chapter_and_section = "chpt".<>);
print "Please type in the search word:\n";
chomp (my $search_key = <>);

open(my $tag_corpus, '<', "tagged${chapter_and_section}.txt") or die $!;
open(my $parse_corpus, '<', "parsed${chapter_and_section}.txt") or die $!;

For the rest of the program to work, I need to be able to do:

my @sentences = <$tag_corpus>; ##right now this is one file, I want to make it more
my @typeddependencies = <$parse_corpus>; ##same as above

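EDIT2: Really sorry for the confusion. After the steps shown, the program runs two for loops, one reading through the lines of the tagged file and one through the lines of the parsed file.

What I want is to do this with more files from the same directory without having to type in the next file name. (That is, I can already run taggedchpt31_1.txt and parsedchpt31_1.txt. I want to run taggedchpt31 and parsedchpt31, which would include ~chpt31_1, ~chpt31_2, etc.)

Ultimately, it would be best if I could join all the tagged files and all the parsed files that share a chapter (so that in the end I still only need to run two files) without saving the joined files to the directory... Now that I have put it into words, I think I should just save the files that include all the sections.

Sorry, and thanks for all your time! See FMc's breakdown of my question for more help.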

Answer 1

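You can iterate over the file names, opening and reading each one in turn. Or you can create an iterator that knows how to read lines from a sequence of files.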

sub files_reader {
    # Takes a list of file names and returns a closure that
    # will yield lines from those files.
    my @handles = map { open(my $h, '<', $_) or die $!; $h } @_;
    return sub {
        shift @handles while @handles and eof $handles[0];
        return unless @handles;
        return readline $handles[0];
    }
}

my $reader = files_reader('foo.txt', 'bar.txt', 'quux.txt');

while (defined(my $line = $reader->())) {   # defined() so that a final line of "0" does not end the loop
    print $line;
}

Or you can use Perl's built-in iterator to do the same thing:

local @ARGV = ('foo.txt', 'bar.txt', 'quux.txt');
while (my $line = <>) {
    print $line;
}
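
If the goal is to fill an array such as @sentences from several files at once, the same built-in iterator can be used in list context. The following is only a minimal sketch and not part of the original answer; slurp_lines is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line from the named files, in order.
sub slurp_lines {
    my @files = @_;
    return () unless @files;   # with an empty @ARGV, <> would read STDIN
    local @ARGV = @files;      # the caller's @ARGV is restored on return
    my @lines = <>;            # list context: reads all files to the end
    return @lines;             # note: <> only warns about files it cannot open
}

# Possible usage (file names taken from the question):
# my @sentences = slurp_lines('taggedchpt1_1.txt', 'taggedchpt1_2.txt');
```

Because @ARGV is localized inside the sub, the helper can be called once for the tagged files and once for the parsed files without the two calls interfering.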

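Edit in response to the follow-up question:

Maybe it would help to break the problem down into smaller subtasks. As I understand it, you have three steps.

Step 1 is to get some input from the user: probably a directory name, and possibly a couple of file name patterns (taggedchpt and parsedchpt).
Step 2 is for the program to find all of the relevant file names. For this task, glob() or readdir() might be useful. There are many questions on StackOverflow about this kind of problem. You will end up with two lists of file names, one for the tagged files and one for the parsed files.
Step 3 is to process the lines of all of the files in each of the two sets. Most of the answers you have received, including mine, will help you with this step.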
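
Steps 2 and 3 could be sketched roughly as follows. This is an illustration under assumptions, not the asker's actual program: chapter_files and read_lines are hypothetical helper names, the file naming scheme (prefix, "chpt", chapter number, underscore, section number, ".txt") is taken from the question, and the directory name is assumed to contain no spaces or glob metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper for step 2: list the section files of one chapter,
# e.g. prefix "tagged" and chapter 31 match taggedchpt31_1.txt,
# taggedchpt31_2.txt, and so on.
sub chapter_files {
    my ($dir, $prefix, $chapter) = @_;
    my @files = glob("$dir/${prefix}chpt${chapter}_*.txt");
    # Sort by section number so that _10 comes after _2.
    my $section = sub { $_[0] =~ /_(\d+)\.txt\z/ ? $1 : 0 };
    my @sorted = sort { $section->($a) <=> $section->($b) } @files;
    return @sorted;
}

# Hypothetical helper for step 3: read all lines of the given files, in order.
sub read_lines {
    my @lines;
    for my $file (@_) {
        open(my $fh, '<', $file) or die "Cannot open $file: $!";
        push @lines, <$fh>;
        close($fh);
    }
    return @lines;
}

# Possible usage for chapter 31 in the current directory:
# my @sentences         = read_lines(chapter_files('.', 'tagged', 31));
# my @typeddependencies = read_lines(chapter_files('.', 'parsed', 31));
```

Sorting on the numeric section suffix matters because glob() returns names in string order, which would place section 10 before section 2.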

Answer 2

You're almost there... this is a bit more efficient than a discrete open on each file...

#!/usr/bin/perl
use strict;
use warnings FATAL => "all";
print "Please type in the chapter and section NUMBERS in the form chp#_sec#:\n";
chomp (my $chapter_and_section = "chpt".<>);
print "Please type in the search word:\n";
chomp (my $search_key = <>);

open(FH, '>', 'output.txt') or die $!;   # Open an output file for writing
foreach ("tagged${chapter_and_section}.txt", "parsed${chapter_and_section}.txt") {
    open(FILE, '<', $_) or die "$_: $!";   # Open the next input file in the list
    while (my $line = <FILE>) {            # Read it one line at a time
        $line =~ s/THIS/THAT/g;   # Regex-replace on each line (use whatever
                                  #     you like instead of "THIS" and "THAT")
        print FH $line;           # Write to the output file
    }
    close(FILE);
}
close(FH) or die $!;
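
The asker also considers saving one combined file per chapter. A minimal sketch of that idea, assuming the list of section files is already known, might look like this; concat_files is a hypothetical helper name and the file names in the usage comment are illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: write the contents of @sources, in order, to $target.
# An existing $target is overwritten.
sub concat_files {
    my ($target, @sources) = @_;
    open(my $out, '>', $target) or die "Cannot write $target: $!";
    for my $source (@sources) {
        open(my $in, '<', $source) or die "Cannot read $source: $!";
        while (my $line = <$in>) {
            print {$out} $line;
        }
        close($in);
    }
    close($out) or die "Cannot close $target: $!";
    return $target;
}

# Possible usage:
# concat_files('taggedchpt1.txt', 'taggedchpt1_1.txt', 'taggedchpt1_2.txt');
# concat_files('parsedchpt1.txt', 'parsedchpt1_1.txt', 'parsedchpt1_2.txt');
```

Keeping the tagged and parsed outputs in separate files preserves the two-file interface that the rest of the asker's program expects.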

Answer 3

Nobody has mentioned the @ARGV hack yet? OK, here it is.

{
    local @ARGV = ('taggedchpt1_1.txt', 'parsedchpt1_1.txt', 'taggedchpt1_2.txt',  
                   'parsedchpt1_2.txt');
    while (<ARGV>) {
       s/THIS/THAT/;
       print FH $_;   # FH is assumed to be an output filehandle that is already open
    }
}

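ARGV is a special filehandle that iterates through all the file names in @ARGV, closing a file and opening the next one as necessary. Normally @ARGV contains the command-line arguments that you passed to perl, but you can set it to anything you want.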
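
While ARGV is being read, the scalar $ARGV holds the name of the file currently open. That makes it possible to read tagged and parsed files in one pass and still keep their lines apart. The following is only a sketch; split_by_kind is a hypothetical helper name, and it assumes that tagged files are exactly those whose base name starts with "tagged".

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: read all files in one pass and sort the lines into
# two arrays according to the name of the file they came from.
sub split_by_kind {
    my @files = @_;
    my (@tagged, @parsed);
    return (\@tagged, \@parsed) unless @files;   # avoid reading STDIN
    local @ARGV = @files;
    while (my $line = <>) {
        # $ARGV is the name of the file that $line was just read from.
        if (basename($ARGV) =~ /^tagged/) {
            push @tagged, $line;
        }
        else {
            push @parsed, $line;
        }
    }
    return (\@tagged, \@parsed);
}

# Possible usage:
# my ($sentences, $typeddependencies) = split_by_kind(
#     'taggedchpt1_1.txt', 'parsedchpt1_1.txt',
#     'taggedchpt1_2.txt', 'parsedchpt1_2.txt');
```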

How to open/join multiple files (depending on user input) and then work with 2 files at once
