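I would like to know how to extract part of a line from a file in Perl. I have a log file from which I want to pull some meaningful information with a Perl script. I can already get the whole line I am looking for, but I only need part of that line.
Perl script (what I have so far):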
#!/usr/bin/perl
use strict;
use warnings;
my $file='F:\3Np_RoboSitter\perl pgm\input.txt';
open my $fh, "<", $file or die $!;
print "************************************************************\n";
print "DC status:\n\n";
while (<$fh>) {
    print if /DC messages Picked/ .. /DC messages Picked from the Queue/;
}
print "\n************************************************************\n\n";
close ($fh);
Input file:
adfaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaadfafafafqdrareeaf
2014-02-14 00:18:04,840 1754897056 INFO ApplicationService aadfafa123 ApplicationService ApplicationServiceCustomerID ApplicationServiceSessionToken Parse of XML started. |HostName=AAAAAA|TimeStamp=2014-02-14 00:16:39.044|Message=OUT;submitApplications.SubmitApplicationBatchProcess;Total 1311 DC messages Picked from the Queue.|Detail=<XMLNSC><LogMessage><messageText>Total 1311 DC messages Picked from the Queue.</messageText></LogMessage></XMLNSC>
dafafafzcvzvsfdfafafffffffffffffffffffffffff
Output:
************************************************************
DC status:
2014-02-14 00:18:04,840 1754897056 INFO ApplicationService aadfafa123
ApplicationService ApplicationServiceCustomerID ApplicationServiceSessio
nToken Parse of XML started. |HostName=AAAAAA|TimeStamp=2014-02-14 00:16:39.0
44|Message=OUT;submitApplications.SubmitApplicationBatchProcess;Total 1311 DC me
ssages Picked from the Queue.|Detail=<XMLNSC><LogMessage><messageText>Total 1311
DC messages Picked from the Queue.</messageText></LogMessage></XMLNSC>
************************************************************
Expected output:
2014-02-14 00:18:04
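Total 1311 DC messages Picked from the Queue. *(the text between the <messageText> tags)*
Team, please share your suggestions when you have a spare moment!
Answer 0 (score: 2)
It always depends on the input. Your input is not in a regular format (neither fixed-width nor CSV), so the simplest approach is a regular expression.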
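# Read the log line by line; $fh is the handle opened in the question's script.
while (my $line = <$fh>) {
    # The timestamp is everything before the first comma (the milliseconds follow it).
    my ($date) = split /,/, $line, 2;
    # Capture the text between the <messageText> tags (non-greedy, case-insensitive).
    if ($line =~ m!<messageText>(.+?)</messageText>!is) {
        print "$date\n$1\n";
    }
}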
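For anyone who wants to try the idea without the log file, here is a minimal self-contained sketch of the same regular-expression approach. It makes a few assumptions that are not in the original answer: the helper name `extract_dc_status` is hypothetical, the timestamp is assumed to sit at the very start of the line in `YYYY-MM-DD HH:MM:SS,mmm` form, and the sample data is a shortened copy of the log line from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): returns
# (timestamp, message) for a matching log line, or an empty list otherwise.
sub extract_dc_status {
    my ($line) = @_;

    # Assumed layout: the line starts with "YYYY-MM-DD HH:MM:SS,mmm";
    # only the part before the comma is captured.
    my ($stamp) = $line =~ /^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),/
        or return;

    # Non-greedy capture of the text between the <messageText> tags.
    my ($msg) = $line =~ m{<messageText>(.+?)</messageText>}i
        or return;

    return ($stamp, $msg);
}

# Shortened sample lines modelled on the input file in the question.
my @log = (
    'adfaaaaaaaaaaaaaaaaadfafafafqdrareeaf',
    '2014-02-14 00:18:04,840 1754897056 INFO ApplicationService aadfafa123 '
      . 'Parse of XML started. |HostName=AAAAAA|'
      . 'Detail=<XMLNSC><LogMessage><messageText>Total 1311 DC messages '
      . 'Picked from the Queue.</messageText></LogMessage></XMLNSC>',
    'dafafafzcvzvsfdfafaffffffff',
);

for my $line (@log) {
    # Lines without a timestamp or a <messageText> element are skipped.
    my ($stamp, $msg) = extract_dc_status($line) or next;
    print "$stamp\n$msg\n";
}
```

Anchoring the timestamp pattern to the start of the line means the junk lines in the sample are skipped even if they happen to contain a comma, which the plain `split` on `,` would not guard against. To run it against the real file, replace the `@log` loop with a `while (my $line = <$fh>)` loop over the opened handle.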