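OK... we have a contact book in Exchange that is exported to an XML file... for our intranet... for our Associate Directory. "Stuff" happened, setting off a chain of events that led to the XML being updated.
Apparently, our Squirrel Mail server uses a Perl script to convert this XML into global.abook.
I'm not familiar with Perl, but the general idea seems easy enough to follow: loop through the XML, pull the "nickname", full name, email and title for each person, and add them to global.abook.
I'm certain the OLD XML file did not have the root\xsd:schema and root\dataroot layout. I'm not sure what the best way to update the script would be.
Perl script: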
#!/usr/bin/perl
use strict;
use XML::Parser;
use Data::Dumper;

my $url = 'http://intranet.mycompany.org/directory/directory.xml';
my $output = '/var/lib/squirrelmail/prefs/global.gabook';
my $file = "curl -sS '$url' |";
my $parser = new XML::Parser(Style => 'Tree');
my $tree = $parser->parsefile($file)->[1];

sub extract {
    my ($string, $record) = @_;
    for (my $i = 0; $i < @{$record}.''; $i++) {
        if ($record->[$i] eq $string) {
            return $record->[$i + 1][2];
        }
    }
    return undef;
}

open FILE, "> $output"
    or die "Couldn't open: $!";
for (my $i = 4; $i < @{$tree}.''; $i += 4) {
    my $record = $tree->[$i];
    my $full = &extract('DisplayName', $record);
    my $title = &extract('JobTitle', $record);
    my $email = &extract('EMailDisplayName', $record);
    next unless($email);
    my $nickname;
    # Nickname is the first part of the email address
    if ($email =~ /^(\w+)\@/) {
        $nickname = $1;
    }
    print FILE "$nickname|$full||$email|$title" . "\n";
}
close FILE
XML file:
<?xml version="1.0" standalone="yes"?>
<root xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:od="urn:schemas-microsoft-com:officedata">
  <xsd:schema>
    ...
  </xsd:schema>
  <dataroot xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" generated="2011-07-12T14:14:13">
    <ROW>
      <DisplayName>John Doe</DisplayName>
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
      <JobTitle>I.D. 10 Technologist</JobTitle>
      <Company>My Company</Company>
      <Department>Administration</Department>
      <FileAs>Doe, John</FileAs>
      <BusinessPhone>(800) 867-5309</BusinessPhone>
      <EMailAddress>jdoe@mycompany.org</EMailAddress>
      <EMailAddressType>SMTP</EMailAddressType>
      <EMailDisplayName>jdoe@mycompany.org</EMailDisplayName>
      <Initials>J.D.</Initials>
      <Private>0</Private>
    </ROW>
    <ROW>
      ...
    </ROW>
  </dataroot>
</root>
Desired text file:
jdoe|John Doe||jdoe@mycompany.org|I.D. 10 Technologist
...
...
Answer 0 (score: 3)
XML::Parser is rather arcane. I would use XML::LibXML.
#!/usr/bin/perl
use strict;
use XML::LibXML qw( );
use XML::LibXML::XPathContext qw( );
my $xml = <<'__EOI__';
<?xml version="1.0" standalone="yes"?>
<root xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:od="urn:schemas-microsoft-com:officedata">
  <xsd:schema>
    ...
  </xsd:schema>
  <dataroot xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" generated="2011-07-12T14:14:13">
    <ROW>
      <DisplayName>John Doe</DisplayName>
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
      <JobTitle>I.D. 10 Technologist</JobTitle>
      <Company>My Company</Company>
      <Department>Administration</Department>
      <FileAs>Doe, John</FileAs>
      <BusinessPhone>(800) 867-5309</BusinessPhone>
      <EMailAddress>jdoe@mycompany.org</EMailAddress>
      <EMailAddressType>SMTP</EMailAddressType>
      <EMailDisplayName>jdoe@mycompany.org</EMailDisplayName>
      <Initials>J.D.</Initials>
      <Private>0</Private>
    </ROW>
  </dataroot>
</root>
__EOI__

# Map a list of nodes to their text content.
sub get_text { map $_->textContent, @_ }

my $parser = XML::LibXML->new();
my $doc = $parser->parse_string($xml);
my $root = $doc->documentElement();

# Each ROW under /root/dataroot is one directory entry.
for my $row ($root->findnodes('/root/dataroot/ROW')) {
    my ($name) = get_text( $row->findnodes('DisplayName') );
    my ($title) = get_text( $row->findnodes('JobTitle') );
    my ($email) = get_text( $row->findnodes('EMailDisplayName') );
    # Skip rows where any of the three elements is absent.
    if (!defined($name) || !defined($title) || !defined($email)) {
        warn("Bad record\n");
        next;
    }
    # The nickname is everything before the "@" in the email address.
    my ($nick) = $email =~ /^([^@]*)/;
    print("$nick|$name||$email|$title\n");
}
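
If switching modules is not an option, the original XML::Parser script can probably be patched instead. The sketch below assumes the old file had the ROW elements directly under the root element (the question does not show the old file), which is what the original loop's `$i = 4; $i += 4` stepping suggests. In Tree style, an element's content is an array of the form `[ \%attrs, tag, content, tag, content, ... ]`, with text stored under the pseudo-tag `0`, so the patch looks up `dataroot` and its `ROW` children by tag name rather than by position. The helper names `children`, `first_text` and `abook_lines` are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Return the content arrays of all child elements named $tag.
# $content is a Tree-style content array: [ \%attrs, tag, content, ... ].
sub children {
    my ($content, $tag) = @_;
    my @found;
    for (my $i = 1; $i < $#{$content}; $i += 2) {
        push @found, $content->[$i + 1] if $content->[$i] eq $tag;
    }
    return @found;
}

# Return the text of the first child element named $tag, or undef.
sub first_text {
    my ($content, $tag) = @_;
    my ($child) = children($content, $tag);
    return undef unless $child;
    # A text-only element has the content [ \%attrs, 0, 'text' ].
    return $child->[2];
}

# Convert an XML string into a list of address book lines.
sub abook_lines {
    my ($xml) = @_;
    # The parsed tree is [ 'root', $root_content ].
    my $tree = XML::Parser->new(Style => 'Tree')->parse($xml);
    my ($dataroot) = children($tree->[1], 'dataroot');
    die "No <dataroot> element found\n" unless $dataroot;

    my @lines;
    for my $row (children($dataroot, 'ROW')) {
        my $full  = first_text($row, 'DisplayName');
        my $title = first_text($row, 'JobTitle');
        my $email = first_text($row, 'EMailDisplayName');
        next unless $email;
        # Nickname is the first part of the email address.
        my ($nickname) = $email =~ /^(\w+)\@/;
        push @lines, join('|', $nickname // '', $full // '', '', $email, $title // '');
    }
    return @lines;
}

my $sample = <<'__XML__';
<?xml version="1.0" standalone="yes"?>
<root>
<xsd:schema>...</xsd:schema>
<dataroot>
<ROW>
<DisplayName>John Doe</DisplayName>
<JobTitle>I.D. 10 Technologist</JobTitle>
<EMailDisplayName>jdoe@mycompany.org</EMailDisplayName>
</ROW>
</dataroot>
</root>
__XML__

print "$_\n" for abook_lines($sample);
```

Looking elements up by name costs a few more lines than the fixed `+= 4` stride, but it should keep working if the export gains or loses sibling elements such as the schema block.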
Answer 1 (score: 2)
Is this what you're looking for?
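use strict;
use warnings;
use XML::Simple;
use LWP::Simple;

my $url = 'http://intranet.mycompany.org/directory/directory.xml';
my $outfile = '/var/lib/squirrelmail/prefs/global.gabook';

# get() returns undef on failure, so check before parsing.
my $xml = get( $url );
die "Couldn't fetch $url\n" unless defined $xml;

# ForceArray keeps ROW an array even when there is only one record,
# KeyAttr => [] turns off folding on name/key/id, and SuppressEmpty => ''
# maps empty elements to empty strings instead of empty hashes.
my $structure = XMLin( $xml, ForceArray => ['ROW'], KeyAttr => [], SuppressEmpty => '' );

open my $out_fh, '>', $outfile or die $!;

foreach my $row ( @{ $structure->{dataroot}{ROW} } ) {
    # A row without FileAs is not treated as a valid record.
    next unless exists $row->{FileAs} and defined $row->{FileAs};
    # Warn about missing fields, then fall back to an empty string.
    my( $email, $name, $title ) = map {
        warn "Warning: $_ is undefined for $row->{FileAs}."
            unless exists $row->{$_} and defined $row->{$_};
        $row->{$_} || '';
    } qw/ EMailAddress DisplayName JobTitle /;
    my $nick;
    if( $email =~ m/^([^@]+)@/ ) {
        $nick = $1;
    } else {
        $nick = '';
        warn "Warning: No nickname for $row->{FileAs}.";
    }
    print $out_fh "$nick|$name||$email|$title\n";
}

close $out_fh or die $!;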
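If your XML isn't very complex, XML::Simple is an easy solution. Also, I don't see the need to shell out to curl when you can use LWP::Simple from within Perl. If you prefer, you can easily modify the above to stay closer to the original script's dependencies: my use of LWP::Simple can be replaced with your curl call.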
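
As a rough illustration of that swap, the fetch could go back through curl along these lines. This is only a sketch: `fetch_with_curl` is a name invented here, and it assumes a `curl` binary is available on the PATH. The list form of `open` avoids passing the URL through a shell, and `--fail` makes curl exit non-zero on HTTP errors so that `close` can report them.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch a URL by running curl and return the body.
sub fetch_with_curl {
    my ($url) = @_;
    # List-form pipe open: no shell is involved, so no quoting problems.
    open my $curl, '-|', 'curl', '-sS', '--fail', $url
        or die "Couldn't run curl: $!";
    # Slurp the whole response in one read.
    my $body = do { local $/; <$curl> };
    # close() returns false if curl exited with a non-zero status.
    close $curl
        or die "curl failed for $url (exit status " . ($? >> 8) . ")\n";
    return $body;
}
```

In the script above, `my $xml = get( $url );` would then become `my $xml = fetch_with_curl( $url );`, and the `use LWP::Simple;` line could be dropped.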
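I added on-screen warnings and a default behavior for cases where a particular field is empty or missing. For example, if a given row lacks EMailAddress, you will get a couple of warnings, but a default empty string is inserted in that column position so that processing can recover gracefully. If you consider this a serious problem, you can change warn to die.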
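
One way to make that choice switchable, sketched with made-up names (`field_or_default` and the `$STRICT` flag are not part of the script above): a small helper returns the field's value and either warns and falls back to the empty string, or dies, depending on the flag.

```perl
use strict;
use warnings;

# Hypothetical switch: set to 1 to abort on the first missing field.
our $STRICT = 0;

# Return $row->{$field}, or '' (with a warning) when it is missing or empty.
# In strict mode a missing field is fatal instead.
sub field_or_default {
    my ($row, $field) = @_;
    my $value = $row->{$field};
    return $value if defined($value) && !ref($value) && length($value);

    my $who = $row->{FileAs} // '(unknown)';
    my $msg = "$field is undefined for $who.\n";
    die "Error: $msg" if $STRICT;
    warn "Warning: $msg";
    return '';
}

my %row = ( FileAs => 'Doe, John', DisplayName => 'John Doe' );
my $name  = field_or_default( \%row, 'DisplayName' );   # value is present
my $title = field_or_default( \%row, 'JobTitle' );      # warns, falls back to ''
```

Keeping the policy in one place means the main loop does not need to change when you decide that a missing field should stop the export.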
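I am also skipping any ROW that has no FileAs tag defined, on the assumption that at least that tag must be present for a record to be valid. You can change that to taste, but I would keep some form of graceful "move on" in the code for records that are not valid. Just in case.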