I need to parse ACARS messages into XML.
Here are some sample messages:
RX_IDX: 13
ACARS mode: O, message label: 5V
ACARS ML description: VDL switch advisory
Aircraft reg: .EI-EUX, flight id: UN0323
Block id: 57, msg. no: S91A
Message content:-
----------------------------------------------------------[05/05/2013 08:58]
RX_IDX: 14
ACARS mode: 2, message label: 1L
ACARS ML description: Off message
Aircraft reg: .D-AIRO, flight id: LH1490
Aircraft vendor: Airbus, short type: A321, full type: A321-131, cn: 0563
Carrier IATA: LH, ICAO: DLH, remarks: Lufthansa
Airlines: Lufthansa
Block id: 56, msg. no: M03A
Message content:-
00002216743GO,X,55655
----------------------------------------------------------[05/05/2013 09:24]
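Each message starts with RX_IDX and ends with a date (for example [05/05/2013 09:24]).
I found a Perl script, but it does not recognize the attributes that come after a comma.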
#!/usr/local/bin/perl
use strict;
use warnings;
my @keys = (
    'RX_IDX',
    'ACARS mode',
    'message label',
    'ACARS ML description',
    'Aircraft reg',
    'flight id',
    'Aircraft vendor',
    'short type',
    'full type',
    'cn',
    'Carrier IATA',
    'ICAO',
    'remarks',
    'Airlines',
    'Block id',
    'msg. no',
    'Message content'
);
my( %keys, %tags );
$keys{$_} = 1 for @keys;
$tags{$_} = $_ . '' for @keys;
$tags{$_} =~ s/ /_/g for @keys;
my $file = 'data8.txt';
open( my $fh, '<', $file) or die("Can't open $file: $!");
my %record = map { $_, '' } @keys;
while( my $line = <$fh> ) {
    chomp($line);
    if( $line =~ m{ \A (.+?) : \s* (\S+) }x ) {
        $record{$1} = $2 if $keys{$1};
        if( $1 eq $keys[$#keys] ) {
            print "<Message>\n";
            print "<$tags{$_}>$record{$_}</$tags{$_}>\n" for @keys;
            print "</Message>\n";
            %record = map { $_, '' } @keys;
        }
    }
}
Regards
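Answer 0 (score: 0)
The problem is that the regular expression in the if condition matches only once per line. Instead, match the regular expression in a while loop until it fails. I added the \G assertion, so each iteration starts where the previous match left off. I also changed the pattern slightly: it no longer anchors at the start of the line (\A), and it allows an optional comma at the end of each pair. It would look like this (I copy only the relevant part of the code):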
while( my $line = <$fh> ) {
    chomp($line);
    while ( $line =~ m{ \G \s* (.+?) \s* : \s* ([^,]+) \s* (?:,|$) }xg ) {
        $record{$1} = $2 if $keys{$1};
        if( $1 eq $keys[$#keys] ) {
            ...
        }
    }
}
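
To see the \G loop in isolation, here is a minimal, self-contained sketch that applies the same pattern to a single line from the sample data. The helper name parse_pairs is hypothetical (it is not part of the original script), and the sketch assumes that values never contain a comma.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): return every
# "key: value" pair found on one line as a flat list
# (key1, value1, key2, value2, ...).
sub parse_pairs {
    my ($line) = @_;
    my @pairs;
    # With /g in scalar context, each match resumes where the previous
    # one ended; \G pins the new attempt to exactly that position.
    while ( $line =~ m{ \G \s* (.+?) \s* : \s* ([^,]+) \s* (?:,|$) }xg ) {
        push @pairs, $1, $2;
    }
    return @pairs;
}

# One line taken from the sample data in the question.
my @pairs = parse_pairs(
    'Aircraft vendor: Airbus, short type: A321, full type: A321-131, cn: 0563'
);

# Print the pairs two elements at a time.
while ( my ( $key, $value ) = splice( @pairs, 0, 2 ) ) {
    print "$key => $value\n";
}
```

This prints the four pairs Aircraft vendor, short type, full type and cn with their values, one per line. Note that a line without a colon, such as the message body 00002216743GO,X,55655, does not match the pattern at all, so the text following "Message content:-" would still need separate handling.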