I want to store specific words in a CSV file

Asked: 2015-06-19 20:25:43

Tags: regex perl

I need to extract specific words from the names of the files in a directory.

#!/usr/bin/perl -w

my $directory = "/home/grds/datafiles";
opendir(DIR, $directory) or die "couldn't open $directory: $!\n";
@files = grep("EXP", readdir(DIR));
closedir(DIR);

foreach $file (@files) {
  # print "$file\n";
        open ($file
}   

Example file name:

EXPresult_3D0R0000002345_test345_cache1_IND0000ASD123_2014_04_12_18_56_1

I need

3D0R0000002345, test345, cache1, IND0000ASD123, 2014_04_12

to be stored in an Excel file, each in a separate column.

2 answers:

Answer 0 (score: 1)

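I just wrote this using Excel::Writer::XLSX; if that module is not already available, you will need to install it to run this script. If you need more features, have a look at the module's documentation on CPAN.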

#!/usr/bin/perl

use strict;
use Excel::Writer::XLSX;

my $k          = 0;
my $reportfile = "report.xlsx";
my $workbook   = Excel::Writer::XLSX->new( $reportfile );
die "Problems creating new Excel file: $!" unless defined $workbook;
my $worksheet  = $workbook->add_worksheet();

# Excel Format
my $format = $workbook->add_format();

my @val;
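opendir( my $dir, "./test/" ) or die "Couldn't open ./test/: $!";
my @file = grep( /EXP/, readdir( $dir ) );
closedir( $dir );

# Iterate over the names directly; <@file> would pass them through glob
foreach ( @file ) {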
    @val = split( /_/, $_ );
    print "$val[1] $val[2] $val[3] $val[4] $val[5]_$val[6]_$val[7]\n";
    $format->set_align( 'top' );
    $worksheet->write( $k, 0, "$val[1]",                 $format );
    $worksheet->write( $k, 1, "$val[2]",                 $format );
    $worksheet->write( $k, 2, "$val[3]",                 $format );
    $worksheet->write( $k, 3, "$val[4]",                 $format );
    $worksheet->write( $k, 4, "$val[5]_$val[6]_$val[7]", $format );
    $k++;
}

# Close the workbook explicitly so the file is fully written
$workbook->close();

Answer 1 (score: 1)

If all you want is CSV output, this is fairly simple.

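This program checks that each directory entry is a file (rather than a directory) and that its name contains EXP, then splits the name on underscores _ into at most 6 fields. That leaves the trailing date-time intact as a single field.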

It then removes the first field, strips the time from the last field, and prints all the remaining fields joined with commas.
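
To make the two steps above concrete, here is a minimal sketch that applies them to the sample name from the question. It only illustrates the split limit and the `\K` substitution on a single string; the variable names are chosen for this example.

```perl
use strict;
use warnings;
use 5.010;    # \K in the substitution needs Perl 5.10 or later

# Sample name taken from the question
my $name = 'EXPresult_3D0R0000002345_test345_cache1_IND0000ASD123_2014_04_12_18_56_1';

# A limit of 6 stops splitting after the fifth underscore,
# so the sixth field keeps the whole date-time tail
my @fields = split /_/, $name, 6;

shift @fields;                         # drop the leading 'EXPresult'
$fields[-1] =~ s/\d+_\d+_\d+\K.*//;    # keep the date, drop the time part

say join ',', @fields;
# prints 3D0R0000002345,test345,cache1,IND0000ASD123,2014_04_12
```

Without the limit, the date would be split into separate year, month and day fields and would have to be joined back together afterwards.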

I have used autodie so that there is no need to check whether opendir succeeded.

use strict;
use warnings;
use 5.010;
use autodie;

use File::Spec::Functions 'rel2abs';

use constant DIRECTORY => '/home/grds/datafiles';

opendir my $dh, DIRECTORY;

while ( my $node = readdir $dh ) {

  my $fullpath = rel2abs($node, DIRECTORY);
  next unless -f $fullpath and $node =~ /EXP/;

  my @fields = split /_/, $node, 6;
  next unless @fields == 6;

  shift @fields;
  $fields[-1] =~ s/\d+_\d+_\d+\K.*//;

  print join(',', @fields), "\n";
}

Output

3D0R0000002345,test345,cache1,IND0000ASD123,2014_04_12
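
The program above prints to standard output, so redirecting it to a file is one way to get a CSV file. If the file should be written from within the script instead, a rough sketch could look like the following. The helper names `csv_line` and `write_csv` and the output name `report.csv` are assumptions made for this illustration, not part of the answer.

```perl
use strict;
use warnings;
use 5.010;
use autodie;

# Hypothetical helper: turn one file name into a CSV line, or return
# undef if the name does not split into the expected six fields
sub csv_line {
    my ($name) = @_;
    my @fields = split /_/, $name, 6;
    return undef unless @fields == 6;
    shift @fields;
    $fields[-1] =~ s/\d+_\d+_\d+\K.*//;
    return join ',', @fields;
}

# Hypothetical helper: write one line per matching plain file
sub write_csv {
    my ($directory, $outfile) = @_;
    opendir my $dh, $directory;
    open my $out, '>', $outfile;
    for my $node ( sort readdir $dh ) {
        next unless $node =~ /EXP/ and -f "$directory/$node";
        my $line = csv_line($node);
        next unless defined $line;
        say {$out} $line;
    }
    close $out;
    closedir $dh;
    return;
}

# Directory from the question; the output file name is assumed
write_csv('/home/grds/datafiles', 'report.csv') if -d '/home/grds/datafiles';
```

This joins the fields with plain commas, which is enough for names like the sample. If a field could ever contain a comma or a quote, a CSV module such as Text::CSV would be the safer choice.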