I have a long string containing a bibliography, alternating between paper-title lines and comma-separated author-list lines, like this:
Learning Programs: A Bayesian Approach
P. Liang, M. Jordan, D. Klein
Variational methods for a Dirichelet process
D. Blei, M. Jordan
What I want is a list of unique authors (sorted alphabetically by last name) with a count for each. For the example above, that would be:
D. Blei (1)
M. Jordan (2)
D. Klein (1)
P. Liang (1)
Can anyone tell me how to do this in Perl or Visual Basic? Many thanks, you rock!
Answer 0 (score: 1)
Works for me:
#!/usr/bin/perl
use strict;
use warnings;
### Collect all authors, using them as hash-slice keys for a quick count.
my %author_count;
while (<DATA>) {                           # read and discard the title line
    chomp( my $authors_line = <DATA> );    # the next line lists the authors
    $_++ for @author_count{ split /, /, $authors_line };
}
### Print the resulting hash, sorted by last name.
### substr($name, 3) skips a leading "X. " initial, which was sufficient for
### the test cases; a regex would be more robust for other name formats.
print "$_ ($author_count{$_})", "\n"
    for sort { substr($a, 3) cmp substr($b, 3) } keys %author_count;
__DATA__
Learning Programs: A Bayesian Approach
P. Liang, M. Jordan, D. Klein
Variational methods for a Dirichelet process
D. Blei, M. Jordan
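The comment above notes that the `substr` sort key could be replaced by a regex. A minimal sketch of that idea follows, assuming the last whitespace-separated token of each name is the surname (so "M. I. Jordan" also sorts under "Jordan"). The helper names `last_name` and `sort_by_last_name` are hypothetical and not part of the original answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: treat the last whitespace-separated token as the surname.
sub last_name {
    my ($author) = @_;
    my ($surname) = $author =~ /(\S+)\s*$/;
    return defined $surname ? $surname : $author;
}

# Sort by surname, breaking ties on the full name. Call this in list context.
sub sort_by_last_name {
    my @authors = @_;
    return sort { last_name($a) cmp last_name($b) or $a cmp $b } @authors;
}

print "$_\n" for sort_by_last_name('P. Liang', 'M. I. Jordan', 'D. Klein', 'D. Blei');
```

Names with multi-word surnames or suffixes (for example "Jr.") would need a smarter key; this sketch does not attempt that.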
Answer 1 (score: 1)
In Perl, read the input two lines at a time, treating every second line as the author line:
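use strict;
use warnings;

my %list;    # author => list of titles that author appears on
while (<DATA>) {
    chomp;
    my $book = $_;                      # title line
    chomp(my $authors = <DATA>);        # author line that follows it
    push @{ $list{$_} }, $book for split /,\s*/, $authors;
}
for (sort { sortA($a) cmp sortA($b) } keys %list) {
    printf "%s (%d)\n", $_, scalar @{ $list{$_} };
}
# Sort key: the first word after a space, i.e. the last name in "X. Name".
sub sortA {
    return $_[0] =~ / (\w+)/ ? $1 : $_[0];
}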
__DATA__
Learning Programs: A Bayesian Approach
P. Liang, M. Jordan, D. Klein
Variational methods for a Dirichelet process
D. Blei, M. Jordan
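Both answers read from `DATA`, while the question starts from a long string. A possible adaptation, sketched here under the assumption that the string holds newline-separated title and author lines, is to open an in-memory filehandle on the string. The function name `count_authors` and the variable `$bibliography` are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count authors in a string of alternating title / author lines.
sub count_authors {
    my ($bibliography) = @_;
    # Opening a reference to a scalar gives a read handle over the string.
    open my $fh, '<', \$bibliography or die "Cannot open in-memory handle: $!";
    my %count;
    while (defined(my $title = <$fh>)) {
        my $authors = <$fh>;
        last unless defined $authors;      # trailing title with no author line
        $authors =~ s/^\s+|\s+$//g;        # trim, including the line ending
        $count{$_}++ for grep { length } split /\s*,\s*/, $authors;
    }
    close $fh;
    return %count;
}

my $bibliography = <<'END';
Learning Programs: A Bayesian Approach
P. Liang, M. Jordan, D. Klein
Variational methods for a Dirichelet process
D. Blei, M. Jordan
END

my %count = count_authors($bibliography);
# Sort on the last whitespace-separated token, assumed to be the surname.
print "$_ ($count{$_})\n"
    for sort { ($a =~ /(\S+)$/)[0] cmp ($b =~ /(\S+)$/)[0] } keys %count;
```

For the sample data this prints the four lines requested in the question, in the same order.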