I have an input text file like the one below, which I refer to as $rlseHistRepo.
Route: TUCSON-AZ
Author: upham
Date: 2018-06-07 20:09:17 UTC
Release:0.0
Content:
Full Release
Comment:
Initial setup
*** Modified on Mon Jun 11 19:18:40 PDT 2018 by upham ***
QRC Acceptor: Admin
Log: http://universityofarizona/ECE101/rev0.0_060718_130854-4307-1528769914.qclog
Successful
Status: {Objects succeeded (1)} {}
--------------------------------------------------
Route: YUMA-AZ
Author: upham
Date: 2018-06-07 20:09:18 UTC
Release:0.0
Content:
Full Release
Comment:
Initial setup
*** Modified on Tue Sep 25 15:40:02 PDT 2018 by upham ***
QRC Acceptor: Admin
Log: http://universityofarizona/ECE101/rev0.0_060718_130854-4307-1537915198.qclog
Successful
Status: {Objects succeeded (33)} {}
--------------------------------------------------
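I want to write a Perl script that parses the input file above and writes the result to a CSV file, but I am having trouble with hashes and arrays, and I lack the knowledge to handle data stored in arrays. The key is to find the lines that begin with Route:, Author:, Date:, Release:, Log:, Status:, Content: and Comment:, grab the string that follows each one, and write it out to the CSV file.
Here is my starting script. I am struggling to get the array printed to the CSV correctly. I would appreciate help correcting it, and an explanation of where and why the output is not printed in the right order. Thanks very much in advance.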
#!/usr/bin/perl
$rlseHistRepo = $ARGV[0];
my %menu;
open(IN, "< $rlseHistRepo" ) || die "cannot read input file: $!\n";
open(OUTCSV , "> rlseLoggingRepo.csv" ) || die "cannot write output file: $!\n";
print OUTCSV "Site,Author,Release,Date,Version,Changes,Comment\n";
print OUTCSV ",,,,,,,\n";
while(<IN> ) {
    my $line = $_;
    chomp($line);
    if( $line =~ m/^Route:/) {
        my ($item, $rlsSite) = split(/\s+/, $line);
        $menu{$item} = $rlsSite;
    }
    if( $line =~ m/^Author:/) {
        my ($item, $rlsAuthor) = split(/\s+/, $line);
        $menu{$item} = $rlsAuthor;
    }
}
close(IN);
foreach $item ( keys %menu ) {
    print OUTCSV "$menu{$item},,,,,\n";
    print "$rlsSite{$item},$rlsAuthor{$item},,,,\n";
}
close(OUTCSV);
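Answer 0 (score: 0)
Since you have not specified what the output should actually look like, I took a stab in the dark and guessed from the input data and your regular expressions.
For production-quality code, follow @Grinnz's advice and use Text::CSV instead.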
#!/usr/bin/perl
use strict;
use warnings;
print "Entry,Site,Author,Release,Date,Version,Changes,Comment\n";
my @entries;
while(<DATA> ) {
    chomp;
    if (my($site) = /^Route:\s+(.+)$/) {
        # start of new entry
        push(@entries, {
            site => $site,
        });
    } elsif (my($author) = /^Author:\s+(.+)$/) {
        $entries[-1]->{author} = $author;
    }
}
foreach my $index (0..$#entries) {
    my $entry = $entries[$index];
    print "$index,$entry->{site},$entry->{author},,,,,\n";
}
__DATA__
Route: TUCSON-AZ
Author: upham
Date: 2018-06-07 20:09:17 UTC
Release:0.0
Content:
Full Release
Comment:
Initial setup
*** Modified on Mon Jun 11 19:18:40 PDT 2018 by upham ***
QRC Acceptor: Admin
Log: http://universityofarizona/ECE101/rev0.0_060718_130854-4307-1528769914.qclog
Successful
Status: {Objects succeeded (1)} {}
--------------------------------------------------
Route: YUMA-AZ
Author: upham
Date: 2018-06-07 20:09:18 UTC
Release:0.0
Content:
Full Release
Comment:
Initial setup
*** Modified on Tue Sep 25 15:40:02 PDT 2018 by upham ***
QRC Acceptor: Admin
Log: http://universityofarizona/ECE101/rev0.0_060718_130854-4307-1537915198.qclog
Successful
Status: {Objects succeeded (33)} {}
--------------------------------------------------
Sample run:
$ perl dummy.pl
Entry,Site,Author,Release,Date,Version,Changes,Comment
0,TUCSON-AZ,upham,,,,,
1,YUMA-AZ,upham,,,,,
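EDIT: An alternative approach is to use the range operator
if (/^Route:/../^----------/) {
    # we are inside a log entry...
}
and then, inside that block, detect keyword lines with
my($keyword, $data) = /^(\w+):\s*(.*)$/;
and text lines with
my($line) = /^\s+(.+)$/;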
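A minimal sketch of how the range-operator alternative might be fleshed out. Several things here are assumptions rather than part of the answer: the sample is an abbreviated copy of the input, `$pending` and the `@fields` column order are illustrative names and choices, and the text-line pattern uses `\s*` instead of `\s+` because the text lines in the data as shown are not indented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Abbreviated sample input, read through an in-memory filehandle.
my $sample = <<'END';
Route: TUCSON-AZ
Author: upham
Release:0.0
Content:
Full Release
Comment:
Initial setup
*** Modified on Mon Jun 11 19:18:40 PDT 2018 by upham ***
QRC Acceptor: Admin
Status: {Objects succeeded (1)} {}
--------------------------------------------------
Route: YUMA-AZ
Author: upham
Content:
Full Release
--------------------------------------------------
END

open my $in, '<', \$sample or die "cannot open sample: $!";

my (@entries, $current, $pending);
while (<$in>) {
    chomp;
    # True from a "Route:" line through the closing line of dashes.
    next unless /^Route:/ .. /^-{10}/;

    if (/^-{10}/) {
        # End of entry: store it and reset the state.
        push @entries, $current if $current;
        ($current, $pending) = (undef, undef);
    }
    elsif (/^\*\*\*/) {
        next;    # skip the "*** Modified on ..." line
    }
    elsif (my ($keyword, $data) = /^(\w+):\s*(.*)$/) {
        $current->{$keyword} = $data;
        # An empty value means the text follows on the next line.
        $pending = length($data) ? undef : $keyword;
    }
    elsif (defined $pending && /^\s*(.+)$/) {
        $current->{$pending} = $1;
        $pending = undef;
    }
}
close $in;

# Illustrative column order (an assumption, not specified in the question).
my @fields = qw(Route Author Release Content Comment Status);
for my $entry (@entries) {
    print join(',', map { defined $_ ? $_ : '' } @{$entry}{@fields}), "\n";
}
```

Lines such as "QRC Acceptor: Admin" and "Successful" are ignored here, because `\w+` does not match a keyword containing a space and no value is pending at that point.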
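Answer 1 (score: 0)
Step 1: Add use strict and use warnings. This raises errors about undeclared variables.
Step 2: Add my to declare $rlseHistRepo. Also add my (%rlsSite, %rlsAuthor) to declare the two hashes used in the final loop. But that is strange, because you are reading values from these hashes without ever storing any data in either of them. This gives us some "uninitialized value" warnings. So I think we need to rethink this a bit.
The idea is to build up a hash for each record. When the record ends (when we get to the line of dashes), we output that record. Like this: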
# Note: say needs "use feature 'say';" near the top of the script
my @keys = qw[Route Author Date Release Log
              Status Content Comment];
my %record;
while(<IN> ) {
    chomp;
    if (/-----/) {
        say OUTCSV join ',', @record{@keys};
        %record = ();
    }
    # ignore lines without a ':'
    next unless /:/;
    # ignore the '***' lines
    next if /\*\*\*/;
    my ($key, $value) = split /\s*:\s*/, $_, 2;
    # Some keys have their values on the next line
    if ($value !~ /\S/) {
        chomp($value = <IN>);
        $value =~ s/^\s+//;
    }
    $record{$key} = $value;
}
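The columns come out in a predictable order here because of the hash slice `@record{@keys}`: the order is taken from the `@keys` array, not from the hash, whereas `keys %menu` in the question returns keys in no particular order. A small self-contained sketch of that behaviour follows; the shortened key list and the record values are only illustrative.

```perl
use strict;
use warnings;

# Column order is fixed by this array, not by the hash.
my @keys = qw[Route Author Date Release];

my %record = (
    Release => '0.0',
    Route   => 'TUCSON-AZ',
    Author  => 'upham',
    # Date is deliberately missing to show how gaps behave.
);

# A hash slice returns the values in the order the keys are listed.
# Missing keys come back as undef, so map them to '' to avoid
# "uninitialized value" warnings when joining.
my $row = join ',', map { defined $_ ? $_ : '' } @record{@keys};

print "$row\n";    # prints: TUCSON-AZ,upham,,0.0
```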
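Step 3: Tidy up a bit by removing some unnecessary variables and turning the script into a Unix filter (reading from STDIN and writing to STDOUT). This is actually easier to write and makes your program more flexible.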
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my @keys = qw[Route Author Date Release Log
Status Content Comment];
say "Site,Author,Release,Date,Version,Changes,Comment";
say ",,,,,,,";
my %record;
while (<>) {
    chomp;
    if (/-----/) {
        say join ',', @record{@keys};
        %record = ();
    }
    # ignore lines without a ':'
    next unless /:/;
    # ignore the '***' lines
    next if /\*\*\*/;
    if (my ($key, $value) = split /\s*:\s*/, $_, 2) {
        # Some keys have their values on the next line
        if ($value !~ /\S/) {
            chomp($value = <>);
            $value =~ s/^\s+//;
        }
        $record{$key} = $value;
    }
}
As others have mentioned, in production code you would want to use Text::CSV to produce the output.
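A hedged sketch of what a row might look like when produced with Text::CSV rather than a plain join. It assumes the Text::CSV module from CPAN is installed (it is not part of core Perl), and the record below is hypothetical: its comment contains a comma and double quotes on purpose, which a plain join would write out as broken CSV.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, assumed to be installed

my @keys = qw[Route Author Date Release Log Status Content Comment];

my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Hypothetical record, not taken from the input file above.
my %record = (
    Route   => 'TUCSON-AZ',
    Author  => 'upham',
    Comment => 'Initial setup, "first" pass',
);

# combine() quotes and escapes each field as needed;
# string() returns the resulting CSV line.
$csv->combine(map { defined $_ ? $_ : '' } @record{@keys})
    or die "Could not build CSV row";
my $line = $csv->string;

print "$line\n";
```

In the filter from step 3, the same combine and string calls could replace the join on the line of dashes.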