I want to write a script that reads data from a flat file and writes it to Excel. My code is below:
#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::WriteExcel;
my $workbook = Spreadsheet::WriteExcel->new( 'deep.xls' );
my $worksheet = $workbook->add_worksheet();
$worksheet->write( 0, 0, "DEEP" ) ;
$worksheet->write( 0, 1, "RIJU" );
$worksheet->write( 1, 0, "Sukhi" );
$worksheet->write( 1, 1, "Abhilash" );
$workbook->close;
My flat file contains the following:
FILE_NAME                   Start_Timestamp     End_Timestamp   Record Count    Inbound/Outbound
OmahaTran.txt               1/25/2018 3:40      1/25/2018 3:40  90390           Inbound
concord                     1/24/2018 20:50     1/24/2018 20:50 8631            Inbound
iDine:RewardsNetwork 5220   1/24/2018 12:01     1/24/2018 12:04 218985          Outbound
nashville                   1/25/2018 4:30      1/25/2018 4:32  6810            Inbound
nstrans0.20180125           1/25/2018 2:00      1/25/2018 2:00  124573          Inbound
As I am new to Perl, can anyone help me work out how to retrieve the "FILE_NAME", "End_Timestamp" and "Record Count" columns from the text file and write them to Excel?
Answer 0 (score: 1)
You can parse the input as a fixed-width file. Once you have the fields, you already know how to write them to Excel...
parse_fixed.pl
#!/usr/bin/env perl
use warnings;
use strict;
my $usage = "usage: $0 file\n";
my $file = $ARGV[0] or die $usage;
-f $file or die $usage;
# Create $workbook and $worksheet objects here.
open my $fh, '<', $file or die "Unable to open '$file' : $!";
while (my $line = <$fh>) {
    chomp($line);
    # Unpack the fields, first field 27 chars, then 19 chars, etc.
    # perldoc -f pack
    my @fields = unpack("A27 A19 A17 A16 A20", $line);
    # Remove leading and trailing whitespace for each field
    # perldoc -f map
    # perldoc perlretut
    my ($file_name, $start, $stop, $record_count, $direction)
        = map { s|^\s*||; s|\s*$||; $_ } @fields;
    print("filename: '$file_name', start: '$start', stop: '$stop', record_count: '$record_count', direction: '$direction'\n");
    # Add $worksheet->write(...) lines for each field here.
}
# Close $workbook here.
Output
perl parse_fixed.pl input
filename: 'FILE_NAME', start: 'Start_Timestamp', stop: 'End_Timestamp', record_count: 'Record Count', direction: 'Inbound/Outbound'
filename: 'OmahaTran.txt', start: '1/25/2018 3:40', stop: '1/25/2018 3:40', record_count: '90390', direction: 'Inbound'
filename: 'concord', start: '1/24/2018 20:50', stop: '1/24/2018 20:50', record_count: '8631', direction: 'Inbound'
filename: 'iDine:RewardsNetwork 5220', start: '1/24/2018 12:01', stop: '1/24/2018 12:04', record_count: '218985', direction: 'Outbound'
filename: 'nashville', start: '1/25/2018 4:30', stop: '1/25/2018 4:32', record_count: '6810', direction: 'Inbound'
filename: 'nstrans0.20180125', start: '1/25/2018 2:00', stop: '1/25/2018 2:00', record_count: '124573', direction: 'Inbound'
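The placeholders in parse_fixed.pl could be filled in along the following lines. This is only a sketch: it keeps the three columns the question asks for, reuses the field widths from the template above, and assumes Spreadsheet::WriteExcel is installed. The output name report.xls, the helpers parse_line and sample_lines, and the sample rows are hypothetical and exist only for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Spreadsheet::WriteExcel;

# Split one fixed-width line into its five fields and trim each one.
sub parse_line {
    my ($line) = @_;
    my @fields = unpack 'A27 A19 A17 A16 A20', $line;
    s/^\s+|\s+$//g for @fields;
    return @fields;
}

# Hypothetical sample rows, padded to the widths the template expects.
sub sample_lines {
    return map { sprintf '%-27s%-19s%-17s%-16s%s', @$_ } (
        [ 'FILE_NAME', 'Start_Timestamp', 'End_Timestamp', 'Record Count', 'Inbound/Outbound' ],
        [ 'OmahaTran.txt', '1/25/2018 3:40', '1/25/2018 3:40', 90390, 'Inbound' ],
        [ 'iDine:RewardsNetwork 5220', '1/24/2018 12:01', '1/24/2018 12:04', 218985, 'Outbound' ],
    );
}

# Read the files named on the command line, or fall back to the sample.
my @lines = @ARGV ? <> : sample_lines();
chomp @lines;

my $out      = 'report.xls';    # hypothetical output name
my $workbook = Spreadsheet::WriteExcel->new($out)
    or die "Unable to create '$out': $!";
my $worksheet = $workbook->add_worksheet();

my $row = 0;
for my $line (@lines) {
    next unless $line =~ /\S/;    # skip blank lines

    # Fields 0, 2 and 3 are FILE_NAME, End_Timestamp and Record Count.
    my ( $file_name, $stop, $record_count ) = ( parse_line($line) )[ 0, 2, 3 ];
    $worksheet->write_row( $row++, 0, [ $file_name, $stop, $record_count ] );
}
$workbook->close;
```

Run it with a file name (for example perl script.pl input) to read real data; with no arguments it falls back to the built-in sample rows. The header line is written as the first spreadsheet row, which gives the columns their titles.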
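Answer 1 (score: 1)
This is a pattern I have used to convert fixed-width fields to comma-separated values. Excel will, of course, happily import this CSV data, which does most of the work for you.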
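It assumes that each field extends from the beginning of one header string to the beginning of the next, and uses the built-in array `@-` to determine where each string starts. Header strings may contain single spaces; two or more consecutive spaces terminate the string.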
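To illustrate how `@-` reports those start positions, here is a small sketch. The helper name header_offsets is hypothetical, and the header is rebuilt with sprintf using column widths of 28, 20, 16 and 16, which is an assumption based on the pack format this answer prints.

```perl
use strict;
use warnings;

# Return the start offset of every header string in $line.
# A header string is a run of non-space characters that may contain
# single spaces; two or more spaces end it.
sub header_offsets {
    my ($line) = @_;
    my @offsets;
    while ( $line =~ / \S+ (?: [ ] \S+ )* /xg ) {
        push @offsets, $-[0];    # $-[0] is where the current match began
    }
    return @offsets;
}

# Hypothetical header, padded to widths 28, 20, 16 and 16.
my $head = sprintf '%-28s%-20s%-16s%-16s%s',
    'FILE_NAME', 'Start_Timestamp', 'End_Timestamp', 'Record Count', 'Inbound/Outbound';

print join( ' ', header_offsets($head) ), "\n";    # prints "0 28 48 64 80"
```

The differences between consecutive offsets (28, 20, 16, 16) become the field widths in the unpack template.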
I hope it is clear that the value of `$template` is printed only for diagnostics and is not part of the CSV data.
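If the comma-separated header strings are not wanted, it is a simple matter to remove the statement that prints them. Alternatively, it is just as easy to delete that row from the spreadsheet after import.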
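The `DATA` file handle is used for convenience and demonstration purposes. Normally you would want to `open` a specific file and use that file handle, or just use `<>` to read from the files named as command-line arguments.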
use strict;
use warnings 'all';
use feature 'say';
my $head;
my $template = do {
    $head = <DATA>;
    my @template;
    my $prev;
    while ( $head =~ / \S+ (?: [ ] \S+ )* /xg ) {
        push @template, defined $prev ? 'A' . ( $-[0] - $prev ) : '@' . $-[0];
        $prev = $-[0];
    }
    push @template, 'A*';
    "@template";
};
say qq{Pack format "$template"\n};
say join ',', unpack $template, $head;
while ( <DATA> ) {
    say join ',', unpack $template, $_;
}
__DATA__
FILE_NAME                   Start_Timestamp     End_Timestamp   Record Count    Inbound/Outbound
OmahaTran.txt               1/25/2018 3:40      1/25/2018 3:40  90390           Inbound
concord                     1/24/2018 20:50     1/24/2018 20:50 8631            Inbound
iDine:RewardsNetwork 5220   1/24/2018 12:01     1/24/2018 12:04 218985          Outbound
nashville                   1/25/2018 4:30      1/25/2018 4:32  6810            Inbound
nstrans0.20180125           1/25/2018 2:00      1/25/2018 2:00  124573          Inbound
Output
Pack format "@0 A28 A20 A16 A16 A*"

FILE_NAME,Start_Timestamp,End_Timestamp,Record Count,Inbound/Outbound
OmahaTran.txt,1/25/2018 3:40,1/25/2018 3:40,90390,Inbound
concord,1/24/2018 20:50,1/24/2018 20:50,8631,Inbound
iDine:RewardsNetwork 5220,1/24/2018 12:01,1/24/2018 12:04,218985,Outbound
nashville,1/25/2018 4:30,1/25/2018 4:32,6810,Inbound
nstrans0.20180125,1/25/2018 2:00,1/25/2018 2:00,124573,Inbound
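Since the question only asks for "FILE_NAME", "End_Timestamp" and "Record Count", the same header-derived template can be used to pick columns by name. The following is a rough sketch rather than part of the answer's code: template_for, csv_field and select_columns are hypothetical helpers, the sample rows are rebuilt with sprintf on the assumption that the columns are 28, 20, 16 and 16 characters wide, and the quoting rule is a simplified one that only quotes values containing a comma, double quote or line break.

```perl
use strict;
use warnings;

# Build an unpack template from the header line: each field runs from
# the start of one header string to the start of the next.
sub template_for {
    my ($head) = @_;
    my ( @template, $prev );
    while ( $head =~ / \S+ (?: [ ] \S+ )* /xg ) {
        push @template, defined $prev ? 'A' . ( $-[0] - $prev ) : '@' . $-[0];
        $prev = $-[0];
    }
    push @template, 'A*';
    return "@template";
}

# Quote a value for CSV only when it needs quoting (simplified rule).
sub csv_field {
    my ($value) = @_;
    return $value unless $value =~ /[",\r\n]/;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# Return CSV lines holding only the columns named in @$wanted.
# The first element of @lines must be the header line.
sub select_columns {
    my ( $wanted, @lines ) = @_;
    my $head     = shift @lines;
    my $template = template_for($head);

    # Map each header name to its field index.
    my @names = unpack $template, $head;
    my %index;
    @index{@names} = 0 .. $#names;
    my @cols = map { $index{$_} // die "No column named '$_'\n" } @$wanted;

    my @out;
    for my $line ( $head, @lines ) {
        my @fields = ( unpack $template, $line )[@cols];
        push @out, join ',', map { csv_field($_) } @fields;
    }
    return @out;
}

# Hypothetical sample rows, padded to widths 28, 20, 16 and 16.
my @lines = map { sprintf '%-28s%-20s%-16s%-16s%s', @$_ } (
    [ 'FILE_NAME', 'Start_Timestamp', 'End_Timestamp', 'Record Count', 'Inbound/Outbound' ],
    [ 'OmahaTran.txt', '1/25/2018 3:40', '1/25/2018 3:40', 90390, 'Inbound' ],
    [ 'concord', '1/24/2018 20:50', '1/24/2018 20:50', 8631, 'Inbound' ],
);

print "$_\n" for select_columns( [ 'FILE_NAME', 'End_Timestamp', 'Record Count' ], @lines );
```

To run this against a real file, replace the sample rows with lines read from a file handle and chomp them first. For data that may contain awkward characters, a dedicated CSV module is likely a safer choice than the hand-written quoting shown here.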