I am trying to split these values into keys and values at the delimiter.
My input:
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"
I am using this code block:
while ( my $line = <IN> ) {
    chomp $line;
    print "$line\n";
    my @values = split( /\s+/, $line );
    foreach $data (@values) {
        chomp $data;
        ( $key, $value ) = split( /=/, $data );
        $key =~ s/\s+//g;
        $key =~ s/"//g;
    }
}
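I get the output below: the split also happens on the spaces inside the quoted values. How can I split the keys and values correctly for the input above?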
_1;
Linux
x86_64;
rv:23.0)
Gecko/20100101es,OU
(X1
Thanks in advance.
Answer 0 (score: 1)
Assuming that " never appears as a character inside a value:
my %hash;
while (my $line = <IN>)
{
    $hash{$1} = ($2 // $3) while $line =~ /(\w+)=(?: "(.+?)" | (\S+) )/xg;
}
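To try the pattern above without an input file, here is a minimal sketch that wraps it in a sub and runs it on the sample line from the question. The name parse_pairs and the sorted printing are assumptions made for illustration, not part of the answer. Note that (.+?) needs at least one character, so an empty quoted value such as key="" would fall through to the \S+ branch and keep its quotes.

```perl
use strict;
use warnings;
use 5.010;    # the // (defined-or) operator and say need Perl 5.10 or later

# parse_pairs is a hypothetical helper name, not part of the answer above.
sub parse_pairs {
    my ($line) = @_;
    my %hash;

    # Same pattern as the answer: a quoted value lands in $2, a bare one in $3.
    $hash{$1} = ( $2 // $3 )
        while $line =~ /(\w+)=(?: "(.+?)" | (\S+) )/xg;

    return \%hash;
}

my $sample = 'user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" '
           . 'request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"';

my $pairs = parse_pairs($sample);
say "$_ => $pairs->{$_}" for sort keys %$pairs;
```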
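Answer 1 (score: 0)
This solution uses the (?|...) branch-reset group, which was introduced in Perl 5.10 (I believe). If you do not want to store the pairs in a hash, you can expand the statement into a full while loop instead. Inside the while loop, the key is in $1 and the value is in $2.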
#!/usr/bin/env perl
use warnings;
use strict;
use 5.010;    # (?|...) needs Perl 5.10 or later
while (<DATA>) {
    chomp;
    my %header;
    # Key in $1; value in $2, whether it was quoted or bare.
    $header{$1} = $2 while (/\G\s*(\S+)=(?|"([^"]*)"|(\S*))/g); #extend here
    printf "%9s => %s\n", $_, $header{$_} for keys %header;
}
__DATA__
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"
Prints:
message => Authentication success
user_agent => Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0
request_id => bbfd6a1f-90c4-45g52-9e7c-db5
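The "extend here" variant mentioned above might look like the following sketch, which handles each pair inside the loop body instead of storing it in a hash. The sub name pairs_in_order and the choice to collect [key, value] pairs are assumptions for illustration; any per-pair processing could go in the loop body. Unlike a hash, this keeps the pairs in input order and preserves repeated keys.

```perl
use strict;
use warnings;
use 5.010;    # (?|...) and say need Perl 5.10 or later

# pairs_in_order is a hypothetical helper name used only for this sketch.
sub pairs_in_order {
    my ($line) = @_;
    my @pairs;

    # Same regex as the answer; \G plus /g resumes where the last match ended.
    while ( $line =~ /\G\s*(\S+)=(?|"([^"]*)"|(\S*))/g ) {
        my ( $key, $value ) = ( $1, $2 );    # copy before any other match runs
        push @pairs, [ $key, $value ];
    }
    return @pairs;
}

my $sample = 'user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" '
           . 'request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"';

say "$_->[0] => $_->[1]" for pairs_in_order($sample);
```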
If the quoting gets more complicated, you should look at the Text::Balanced routines, in particular extract_quotelike.
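As a rough sketch of what that could look like, the example below handles values containing backslash-escaped quotes. It uses extract_delimited from Text::Balanced rather than extract_quotelike, because only plain double-quoted values are assumed here. The sub name parse_escaped, the escaped sample input, and the unescaping step are all assumptions for illustration.

```perl
use strict;
use warnings;
use 5.010;
use Text::Balanced qw(extract_delimited);

# parse_escaped is a hypothetical helper name used only for this sketch.
sub parse_escaped {
    my ($line) = @_;
    my %pairs;
    my $rest = $line;

    # Peel one "key=" off the front of the remaining text per iteration.
    while ( $rest =~ s/^\s*(\w+)=// ) {
        my $key = $1;
        my $value;
        if ( $rest =~ /^"/ ) {
            # In list context: the quoted chunk (quotes included), then the remainder.
            my ( $quoted, $remainder ) = extract_delimited( $rest, '"' );
            last unless defined $quoted and length $quoted;    # unbalanced quote
            $value = substr( $quoted, 1, -1 );    # drop the surrounding quotes
            $value =~ s/\\(.)/$1/g;               # turn \" and \\ into " and \
            $rest = $remainder;
        }
        else {
            $rest =~ s/^(\S*)//;                  # bare value: up to next whitespace
            $value = $1;
        }
        $pairs{$key} = $value;
    }
    return \%pairs;
}

my $parsed = parse_escaped('message="say \\"hi\\"" request_id=42');
say "$_ => $parsed->{$_}" for sort keys %$parsed;
```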
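Answer 2 (score: 0)
You can use alternative capture group numbering (see perlretut, "Alternative capture group numbering") to capture the value as either the contents of the enclosing quotes or a run of non-whitespace characters.
Then, because the capture groups come back as key/value pairs, you can initialize the hash directly, like this: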
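use strict;
use warnings;
use Data::Dump;    # CPAN module that provides dd
while (<DATA>) {
    chomp;
    # In list context, /g returns every captured key and value as one flat list.
    my %hash = /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;
    dd \%hash;
}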
__DATA__
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"
Output:
{
message => "Authentication success",
request_id => "bbfd6a1f-90c4-45g52-9e7c-db5",
user_agent => "Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0",
}
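The direct hash initialization works because a /g match in list context returns all captures from all matches as one flat list, and the branch-reset group means each match contributes exactly two captures. The sketch below makes that intermediate list visible using only core Perl. The sub name line_to_hash and the shortened sample line are assumptions for illustration.

```perl
use strict;
use warnings;
use 5.010;    # (?|...) and say need Perl 5.10 or later

my $line = 'id=42 message="Authentication success"';

# List context: the captures of every match, flattened into key, value, key, value.
my @flat = $line =~ /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;
# @flat is ('id', '42', 'message', 'Authentication success')

# An even-length list of pairs can be assigned straight to a hash.
my %hash = @flat;
say "$_ => $hash{$_}" for sort keys %hash;

# line_to_hash is a hypothetical helper name wrapping the same idea.
sub line_to_hash {
    my ($text) = @_;
    my %h = $text =~ /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;
    return \%h;
}
```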