How do I parse XML with the following content?
<?xml version="1.0"?>
<saw:ibot xmlns:saw="com.siebel.analytics.web/report/v1" version="1" priority="normal" jobID="36 ">
<saw:schedule timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)" disabled="false">
<saw:start repeatMinuteInterval="60" endTime="23:59:00" startImmediately="true"/>
<saw:recurrence runOnce="false">
<saw:weekly weekInterval="1" mon="true" tue="true" wed="true" thu="true" fri="true"/>
</saw:recurrence>
</saw:schedule>
<saw:dataVisibility type="recipient" runAs="cgm"/>
<saw:choose>
<saw:when condition="true">
<saw:deliveryContent>
<saw:headline>
<saw:caption>
<saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
</saw:caption>
</saw:headline>
<saw:conditionalReport/>
</saw:deliveryContent>
<saw:postActions/>
</saw:when>
<saw:otherwise/>
</saw:choose>
<saw:deliveryDestinations>
<saw:destination category="dashboard"/>
<saw:destination category="activeDeliveryProfile"/>
</saw:deliveryDestinations>
<saw:recipients subscribers="true" customize="false" specificRecipients="false">
<saw:subscribers>
<saw:user name="mbussey@xyz.com"/>
<saw:user name="kimmy.chan@pqr.com"/>
<saw:user name="chudgins@gmail.com"/>
</saw:subscribers>
</saw:recipients>
<saw:conditionQuery>
<saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"/>
</saw:conditionQuery>
</saw:ibot>
and retrieve the following output?
mbussey@xyz.com
kimmy.chan@pqr.com
chudgins@gmail.com
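Also, I have 5 .xml files that contain different name values to parse. Is there any way to parse them on the command line, merge the results, and write them to a single file?
I tried sed and awk, but they did not help me get the desired output.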
Answer 0 (score: 3)
This command parses the XML document and uses XPath to select the value of the name attribute at the location /saw:ibot/saw:recipients/saw:subscribers/saw:user:
xmlstarlet sel -t -v '/saw:ibot/saw:recipients/saw:subscribers/saw:user/@name' </tmp/xml
Output:
mbussey@xyz.com
kimmy.chan@pqr.com
chudgins@gmail.com
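The question also asks how to combine several files into one output file. As a rough sketch only, here is one way to do that in Python with the standard library's xml.etree.ElementTree instead of xmlstarlet. The function names subscriber_names and merge_names are made up for this illustration, and the namespace URI is copied from the xmlns:saw declaration in the question.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the xmlns:saw declaration in the question's XML.
NS = {"saw": "com.siebel.analytics.web/report/v1"}


def subscriber_names(source):
    """Return the name attribute of each saw:user under saw:subscribers.

    `source` is a file path or an open file object; the document root is
    assumed to be the saw:ibot element, as in the question.
    """
    root = ET.parse(source).getroot()
    users = root.findall("saw:recipients/saw:subscribers/saw:user", NS)
    return [u.get("name") for u in users if u.get("name") is not None]


def merge_names(sources, out_path):
    """Write the names found in every source to out_path, one per line."""
    with open(out_path, "w", encoding="utf-8") as out:
        for source in sources:
            for name in subscriber_names(source):
                out.write(name + "\n")
```

Calling merge_names(["a.xml", "b.xml"], "merged.txt") (hypothetical file names) would write the names from both files, in file order, to merged.txt. Duplicates are kept; whether they should be removed depends on what the merged list is for.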
Answer 1 (score: 1)
Use an XML parser. Personally, I like XML::Twig and perl.
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig->new( );
$twig->parsefile ( 'your_file.xml' );
foreach my $saw_user ( $twig->get_xpath('//saw:user') ) {
print $saw_user ->att('name'), "\n";
}
Prints:
mbussey@xyz.com
kimmy.chan@pqr.com
chudgins@gmail.com
If you want a 'one-liner', use this instead:
perl -MXML::Twig -0777 -e 'print map { $_->att("name")."\n" } ( XML::Twig->parse( <> )->get_xpath("//saw:user") )' your_xml_file
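Please, for the sake of future maintenance programmers and sysadmins, do not use regular expressions to parse XML. Why, you ask? Well, taking your XML as an example, it could look like your original or like any of the following and still be semantically identical: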
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot
jobID="36"
priority="normal"
version="1"
xmlns:saw="com.siebel.analytics.web/report/v1">
<saw:schedule
disabled="false"
timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)">
<saw:start
endTime="23:59:00"
repeatMinuteInterval="60"
startImmediately="true"
/>
<saw:recurrence runOnce="false">
<saw:weekly
fri="true"
mon="true"
thu="true"
tue="true"
wed="true"
weekInterval="1"
/>
</saw:recurrence>
</saw:schedule>
<saw:dataVisibility
runAs="cgm"
type="recipient"
/>
<saw:choose>
<saw:when condition="true">
<saw:deliveryContent>
<saw:headline>
<saw:caption>
<saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
</saw:caption>
</saw:headline>
<saw:conditionalReport/>
</saw:deliveryContent>
<saw:postActions/>
</saw:when>
<saw:otherwise/>
</saw:choose>
<saw:deliveryDestinations>
<saw:destination category="dashboard" />
<saw:destination category="activeDeliveryProfile" />
</saw:deliveryDestinations>
<saw:recipients
customize="false"
specificRecipients="false"
subscribers="true">
<saw:subscribers>
<saw:user name="mbussey@xyz.com" />
<saw:user name="kimmy.chan@pqr.com" />
<saw:user name="chudgins@gmail.com" />
</saw:subscribers>
</saw:recipients>
<saw:conditionQuery>
<saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content" />
</saw:conditionQuery>
</saw:ibot>
Or like this (note how the element tags are wrapped):
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot jobID="36" priority="normal" version="1" xmlns:saw="com.siebel.analytics.web/report/v1">
<saw:schedule disabled="false" timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)">
<saw:start endTime="23:59:00" repeatMinuteInterval="60" startImmediately="true"/>
<saw:recurrence runOnce="false">
<saw:weekly fri="true" mon="true" thu="true" tue="true" wed="true" weekInterval="1"/>
</saw:recurrence>
</saw:schedule>
<saw:dataVisibility runAs="cgm" type="recipient"/>
<saw:choose>
<saw:when condition="true">
<saw:deliveryContent>
<saw:headline>
<saw:caption>
<saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
</saw:caption>
</saw:headline>
<saw:conditionalReport/>
</saw:deliveryContent>
<saw:postActions/>
</saw:when>
<saw:otherwise/>
</saw:choose>
<saw:deliveryDestinations>
<saw:destination category="dashboard"/>
<saw:destination category="activeDeliveryProfile"/>
</saw:deliveryDestinations>
<saw:recipients customize="false" specificRecipients="false" subscribers="true">
<saw:subscribers>
<saw:user name="mbussey@xyz.com"/>
<saw:user name="kimmy.chan@pqr.com"/>
<saw:user name="chudgins@gmail.com"/>
</saw:subscribers>
</saw:recipients>
<saw:conditionQuery>
<saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"/>
</saw:conditionQuery>
</saw:ibot>
Or like this:
<?xml version="1.0" encoding="utf-8"?>
<saw:ibot
jobID="36"
priority="normal"
version="1"
xmlns:saw="com.siebel.analytics.web/report/v1"
><saw:schedule
disabled="false"
timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)"
><saw:start
endTime="23:59:00"
repeatMinuteInterval="60"
startImmediately="true"
/><saw:recurrence
runOnce="false"
><saw:weekly
fri="true"
mon="true"
thu="true"
tue="true"
wed="true"
weekInterval="1"
/></saw:recurrence></saw:schedule><saw:dataVisibility
runAs="cgm"
type="recipient"
/><saw:choose
><saw:when
condition="true"
><saw:deliveryContent
><saw:headline
><saw:caption
><saw:text
>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text></saw:caption></saw:headline><saw:conditionalReport
/></saw:deliveryContent><saw:postActions
/></saw:when><saw:otherwise
/></saw:choose><saw:deliveryDestinations
><saw:destination
category="dashboard"
/><saw:destination
category="activeDeliveryProfile"
/></saw:deliveryDestinations><saw:recipients
customize="false"
specificRecipients="false"
subscribers="true"
><saw:subscribers
><saw:user
name="mbussey@xyz.com"
/><saw:user
name="kimmy.chan@pqr.com"
/><saw:user
name="chudgins@gmail.com"
/></saw:subscribers></saw:recipients><saw:conditionQuery
><saw:reportRefNode
path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"
/></saw:conditionQuery></saw:ibot>
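Hopefully, looking at these examples, you can see that reformatting the XML in perfectly valid ways could one day mysteriously break your regular expression.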
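To make the point concrete, here is a small sketch in Python (not the Perl used above) that runs a line-oriented regex and a real parser over two layouts of the same abbreviated document. The pattern is only a guess at what a sed or awk attempt might look like, and the function names are invented for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI taken from the xmlns:saw declaration in the question's XML.
NS = {"saw": "com.siebel.analytics.web/report/v1"}

# Two layouts of the same (abbreviated) document: one element per line,
# and tags and attributes wrapped across lines.
COMPACT = """<saw:ibot xmlns:saw="com.siebel.analytics.web/report/v1">
<saw:subscribers>
<saw:user name="mbussey@xyz.com"/>
<saw:user name="kimmy.chan@pqr.com"/>
</saw:subscribers>
</saw:ibot>"""

WRAPPED = """<saw:ibot
xmlns:saw="com.siebel.analytics.web/report/v1"
><saw:subscribers
><saw:user
name="mbussey@xyz.com"
/><saw:user
name="kimmy.chan@pqr.com"
/></saw:subscribers></saw:ibot>"""

# Illustrative line-oriented pattern of the kind a sed/awk script might use.
LINE_PATTERN = re.compile(r'<saw:user name="([^"]*)"')


def names_by_regex(text):
    """Apply the pattern to each line separately, as sed or awk would."""
    found = []
    for line in text.splitlines():
        found.extend(LINE_PATTERN.findall(line))
    return found


def names_by_parser(text):
    """Parse the document and read the name attribute of each saw:user."""
    root = ET.fromstring(text)
    return [user.get("name") for user in root.findall(".//saw:user", NS)]
```

With these inputs, names_by_parser returns the same two names for both COMPACT and WRAPPED, while names_by_regex finds both names in COMPACT and none in WRAPPED, because the tag and its attribute no longer share a line.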