I'm parsing a file, part of which contains records in the following format:
CategoryA--
5: UserA
6: UserB
7: UserC
CategoryB--
4: UserA
5: UserB
I want to load it into a hash that looks like this:
{ UserA => { CategoryA => 5, CategoryB => 4, },
  UserB => { CategoryA => 6, CategoryB => 5, },
  UserC => { CategoryA => 7, },
}
How can I do this with a regex?
Edit: It doesn't have to be purely a regex; a solution using plain Perl and a loop would be fine too.
Answer 0 (score: 5)
You need two regexes: one to identify a new category, and one to parse the user records.
#!/usr/bin/perl
use strict;
use warnings;
my %users;
my $cur;
while (<DATA>) {
    if (my ($category) = /^(.*)--$/) {
        $cur = $category;
        next;
    }
    next unless my ($id, $user) = /([0-9]+): (\w+)/;
    die "no category found" unless defined $cur;
    $users{$user}{$cur} = $id;
}
use Data::Dumper;
print Dumper \%users;
__DATA__
CategoryA--
5: UserA
6: UserB
7: UserC
CategoryB--
4: UserA
5: UserB
Alternatively, if you have Perl 5.10 or later, you can use named captures with a single regex:
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
my %users;
my $cur;
while (<DATA>) {
    next unless /^(?:(?<category>.*)--|(?<id>[0-9]+): (?<user>\w+))$/;
    if (exists $+{category}) {
        $cur = $+{category};
        next;
    }
    die "no category found" unless defined $cur;
    $users{$+{user}}{$cur} = $+{id};
}
use Data::Dumper;
print Dumper \%users;
__DATA__
CategoryA--
5: UserA
6: UserB
7: UserC
CategoryB--
4: UserA
5: UserB
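Answer 1 (score: 3)
This Perl code seems to do what you need (with essentially one change). My data structure differs slightly, but not by much: it is a hash reference rather than a plain hash.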
#!/usr/bin/perl
use strict;
my @array = (
    "CategoryA--",
    "5: UserA",
    "6: UserB",
    "7: UserC",
    "CategoryB--",
    "4: UserA",
    "5: UserB"
);
my ($dataFileContents, $currentCategory);
for (@array) {
    $currentCategory = $1 if (/(Category[A-Z])--/);
    if (/(\d+): (User[A-Z])/) {
        $dataFileContents->{$2}->{$currentCategory} = $1;
    }
}
Answer 2 (score: 1)
Not exactly golfing here, but it can be done with a single alternation:
my ( %data, $category );
while ( <DATA> ) {
    next unless /^(?:(Category\w+)|(\d+):\s*(User\w+))/;
    ( $1 ? $category = $1 : 0 ) or $data{$3}{$category} = $2;
}
Data::Dumper (actually Smart::Comments) shows the output:
{
    UserA => {
        CategoryA => '5',
        CategoryB => '4'
    },
    UserB => {
        CategoryA => '6',
        CategoryB => '5'
    },
    UserC => {
        CategoryA => '7'
    }
}
Answer 3 (score: 0)
This splits it up for you.
prompt> ruby e.rb
[["CategoryA--", nil, nil], [nil, "5", "UserA"], [nil, "6", "UserB"], [nil, "7", "UserC"], ["CategoryB--", nil, nil], [nil, "4", "UserA"], [nil, "5", "UserB"]]
prompt> cat e.rb
s = <<TXT
CategoryA--
5: UserA
6: UserB
7: UserC
CategoryB--
4: UserA
5: UserB
TXT
p s.scan(/(^.*--$)|(\d+): (.*$)/)
prompt>
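The scan above only splits the input into tuples. As a minimal sketch of folding those records into the nested hash the question asks for, something like the following might work. The method name `build_user_hash` is hypothetical and not part of the original answer, and it assumes that category lines end in `--` and that record lines look like `number: name`. Unlike the scan above, it drops the trailing `--` from the category name.

```ruby
# Hypothetical helper (not from the original answer): folds the records
# into a nested hash of the form { user => { category => id } }.
def build_user_hash(text)
  # Each new user key starts out with an empty inner hash.
  users = Hash.new { |hash, key| hash[key] = {} }
  current = nil

  text.each_line do |line|
    line = line.chomp
    if (m = line.match(/\A(.*)--\z/))
      # Category header: remember it for the records that follow.
      current = m[1]
    elsif (m = line.match(/\A(\d+): (\w+)\z/))
      # Record line: m[1] is the number, m[2] is the user name.
      raise "no category found" if current.nil?
      users[m[2]][current] = m[1].to_i
    end
  end

  users
end

s = <<TXT
CategoryA--
5: UserA
6: UserB
7: UserC
CategoryB--
4: UserA
5: UserB
TXT

p build_user_hash(s)
```

Lines that match neither pattern are skipped silently. Raising on them instead would be a reasonable alternative if the input is expected to be strictly well formed.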
Answer 4 (score: 0)
#!/usr/bin/perl
use strict;
use Data::Dumper;
print "Content-type: text/html\n\n";
my ($x,%data);
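while (<DATA>) {
    if (/^(Category\w+)/) {
        # Category header: remember it for the records that follow.
        $x = $1;
    } elsif (/^([0-9]+):\s*(User\w+)/) {
        # Record line: $1 is the number, $2 is the user name.
        if (!defined($data{$2})) {
            $data{$2} = {$x, int($1)};
        } else {
            $data{$2}{$x} = int($1);
        }
    }
}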
print Dumper \%data;
__DATA__
CategoryA--
5: UserA
6: UserB
7: UserC
CategoryB--
4: UserA
5: UserB
Result:
$VAR1 = {
    'UserC' => {
        'CategoryA' => 7
    },
    'UserA' => {
        'CategoryA' => 5,
        'CategoryB' => 4
    },
    'UserB' => {
        'CategoryA' => 6,
        'CategoryB' => 5
    }
};