Avoid creating an empty file

Time: 2014-10-31 08:33:15

Tags: perl

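I want to split a large file into smaller files, breaking at specific lines identified with a regex. Can anyone help? My code does the job, but it also creates an empty file.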

#!/usr/local/lib/perl/5.14.2

open( INFILE, 'test.txt' );
@lines = <INFILE>;
$file  = "outfile";
for ( $j = 0; $j <= $#lines; $j++ ) {
    open( OUTFILE, ">", $file . $j );
    $file_name = $file . $j;
    #print "file is $file_name\n";
    $i = 0;
    while (@lines) {
        $_ = shift @lines;
        chomp;
        $i++;
        if ( $_ =~ /^###\s*(.*)\s*###/ && $i > 1 ) {
            unshift @lines, "$_\n";
            print "$filename\n";
            last;
        }
        print OUTFILE "$_\n";
    }
    close(OUTFILE);
}
close(INFILE);

My input file contains:

------------- 
### abcd hdkjfkdj #### 
body 1 dsjklsjdfskl 
### zyz fhid ### 
abcdksdsd djnfkldsfmnsldk ;lkjfkl 
--------------------------- 

It creates 3 files named outfile0, outfile1, and outfile2. But outfile0 is empty, and I want to avoid that.

2 answers:

Answer 0: (score: 3)

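The fix is to open an output file only in response to finding a matching line. Your program opens a new file at the top of every loop iteration, before it has seen a `###` line, which is why it ends up with an empty output file.

Here is a working rewrite. I also removed the temporary @lines array.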

#!/usr/bin/perl
#
use warnings;
use strict;

open(my $file,"<", "test.txt") || die $!;
my $counter=1;
my $out;

while(<$file>) {
  if (/###\s*(.*)\s*###/) { 
    open($out, ">", "outfile$counter") || warn "outfile$counter $!";
    $counter++;
  }
  print $out $_ if $out;
}

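Answer 1: (score: 0)

If you want to use the text between the ### markers as the file title, you can set the file name when the pattern match succeeds on the line containing the ### markers.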

#!/usr/bin/perl
use strict;
use warnings;

open my $fh, '<', 'my_file.txt' or die "Could not open file: $!";

# initialise a variable that will hold the output file handle
my $out;
while (<$fh>) {
    # capture the title between the # signs
    if (/##+ (.*?) ##+/) {
        open $out, '>', $1.".txt" or die "Could not create file $1.txt: $!";
    }
    elsif ($out) {
        print $out $_;
    }
    else {
        # if $out is not set, we haven't yet encountered a title block
        warn "Error: line found with no title block: $_";
    }
}

Sample input:

Text files containing their own name
### questions-1 ####
Why are a motorcycle's front brakes more effective than back?
Is it possible to make a gradient follow a path in Illustrator?
Text files containing their own name
### questions-2 ###
Why does Yoda mourn the Jedi after order 66 is executed?
what are the standard gui elements called?
Flybe just cancelled my return flight. Will they refund that part of the trip?
### questions-3 ###
Merge two arrays of ElementModels?
Is this set open or closed?

Output: three files, questions-1.txt, questions-2.txt, and questions-3.txt, each containing the corresponding lines. For example, questions-1.txt:

Why are a motorcycle's front brakes more effective than back?
Is it possible to make a gradient follow a path in Illustrator?
Text files containing their own name

You didn't say whether you want the ### lines in the output, so I left them out.
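
If the ### lines should be kept, one option is to write the marker line to the new file right after opening it. The sketch below illustrates the idea on in-memory data rather than real files, so it can be run without any input file. `split_by_title` and its `$keep_markers` flag are hypothetical names introduced only for this illustration, and the sample lines are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): group lines by title
# block and return a hash reference of title => text.
# $keep_markers is an assumed flag: when true, the ### line itself is
# stored as the first line of its section.
sub split_by_title {
    my ($lines, $keep_markers) = @_;
    my (%sections, $title);
    for my $line (@$lines) {
        if ($line =~ /##+ (.*?) ##+/) {
            # A new title block starts here
            $title = $1;
            $sections{$title} = $keep_markers ? $line : '';
        }
        elsif (defined $title) {
            # Ordinary line: append to the current section
            $sections{$title} .= $line;
        }
        # Lines before the first title block are silently skipped here
    }
    return \%sections;
}

my @lines = (
    "### questions-1 ####\n", "first line\n",
    "### questions-2 ###\n",  "second line\n",
);
my $kept = split_by_title(\@lines, 1);
print $kept->{'questions-1'};
```

In the file-based loop, the equivalent change would be to print `$_` to `$out` immediately after the `open` succeeds.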

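Depending on which operating system you are using and what your potential file names contain, you may need to filter them and replace special characters with underscores (or simply remove the special characters).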
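
A minimal sketch of such filtering follows. `sanitize_filename` is a hypothetical helper, and the set of allowed characters (ASCII letters, digits, dot, underscore, hyphen) and the `untitled` fallback are assumptions chosen to be conservative; adjust them to what your file system actually permits.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): turn a captured
# title into a conservative file name.
sub sanitize_filename {
    my ($title) = @_;
    # Replace each run of characters outside the assumed safe set
    # with a single underscore
    (my $name = $title) =~ s/[^A-Za-z0-9._-]+/_/g;
    # Trim leading dots/underscores and trailing underscores
    $name =~ s/^[._]+|_+$//g;
    # Fall back to a fixed name if nothing usable is left
    return length $name ? $name : 'untitled';
}

print sanitize_filename('abcd hdkjfkdj'), "\n";
print sanitize_filename('a/b: c?'), "\n";
```

In the loop above, the `open` call would then use `sanitize_filename($1) . ".txt"` instead of `$1 . ".txt"`. Note that two different titles can map to the same sanitized name, in which case the later file would overwrite the earlier one.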