sed seems to behave inconsistently when parsing values that are absent

Time: 2010-11-02 09:33:11

Tags: parsing shell sed

My file contains the following lines:

bash$ cat blah.txt
<smsDeliveryStatus value="Provider Malfunction"/>
<smsDeliveryStatus value="Provider Malfunction" id="23434"/>
<smsDeliveryStatus value="Delivery Failure"/>
<smsDeliveryStatus value="Delivery Successful" id="2"/>
bash$

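For each line of the file I want to extract the value and the id, and print Unknown wherever the value or the id is missing. I wrote the command below, but setting the id to Unknown works for some lines and fails for others: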

bash$ cat blah.txt | sed -nr "/smsDeliveryStatus /{h; /value/ {s/.*value=\"([^\"]*)?\".*/value: \1/}; /value/! {s/.*/value: Unknown/}; p; x; /id/ {s/.*id=\"([^\"]+)\".*/id: \1/g}; /id/! {s/.*/id: Unknown/g}; p}"

For the file above, this produces the following output:

value: Provider Malfunction
<smsDeliveryStatus value="Provider Malfunction"/>
value: Provider Malfunction
id: 23434
value: Delivery Failure
id: Unknown
value: Delivery Successful
id: 2

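Oddly, the first line with a missing id is printed in full, while the second line with a missing id gets its id set to Unknown as expected. Can anyone explain why this happens? Why does /id/ behave differently the first time it is evaluated than the second time?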


1 answer:

Answer 0 (score: 0)

I added several more lines to the file:

bash$ cat blah.txt
<smsDeliveryStatus value="Provider Malfunction"/>
<smsDeliveryStatus value="Provider Malfunction" id="23434"/>
<smsDeliveryStatus value="Delivery Failure"/>
<smsDeliveryStatus value="Delivery Successful" id="2"/>
<smsDeliveryStatus value="Provider Malfunction"/>
<smsDeliveryStatus value="Delivery Failure"/>
<smsDeliveryStatus value="Delivery Successful" id="2"/>
<smsDeliveryStatus value="Provider Malfunction" id="23434"/>
<smsDeliveryStatus value="Delivery Failure"/>
<smsDeliveryStatus value="Provider Malfunction"/>
bash$

When I ran the command again, I got the following:

bash$ cat blah.txt |  sed -nr "/smsDeliveryStatus /{h; /value/ {s/.*value=\"([^\"]*)?\".*/value: \1/}; /value/! {s/.*/value: Unknown/}; p; x; /id/ {s/.*id=\"([^\"]*)\".*/id: \1/g}; /id/! {s/.*/id: Unknown/g}; p}"
value: Provider Malfunction
<smsDeliveryStatus value="Provider Malfunction"/>
value: Provider Malfunction
id: 23434
value: Delivery Failure
id: Unknown
value: Delivery Successful
id: 2
value: Provider Malfunction
<smsDeliveryStatus value="Provider Malfunction"/>
value: Delivery Failure
id: Unknown
value: Delivery Successful
id: 2
value: Provider Malfunction
id: 23434
value: Delivery Failure
id: Unknown
value: Provider Malfunction
<smsDeliveryStatus value="Provider Malfunction"/>
bash$ 

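This showed me that every line that failed contains the letters id inside the word Provider. On those lines /id/ matched even though there was no id attribute, the substitution then found nothing to replace, and the line was printed unchanged. I fixed it by putting \b word boundaries around id, as follows: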

bash$ cat blah.txt |  sed -nr "/smsDeliveryStatus /{h; /value/ {s/.*value=\"([^\"]*)?\".*/value: \1/}; /value/! {s/.*/value: Unknown/}; p; x; /\bid\b/ {s/.*id=\"([^\"]*)\".*/id: \1/g}; /\bid\b/! {s/.*/id: Unknown/g}; p}"
value: Provider Malfunction
id: Unknown
value: Provider Malfunction
id: 23434
value: Delivery Failure
id: Unknown
value: Delivery Successful
id: 2
value: Provider Malfunction
id: Unknown
value: Delivery Failure
id: Unknown
value: Delivery Successful
id: 2
value: Provider Malfunction
id: 23434
value: Delivery Failure
id: Unknown
value: Provider Malfunction
id: Unknown
bash$
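
The false match can be reproduced in isolation. The snippet below is only an illustrative sketch: the variable names are made up here, and matching on ` id="` is one possible stricter test, not part of the original command.

```shell
# Sample lines copied from blah.txt.
line_provider='<smsDeliveryStatus value="Provider Malfunction"/>'
line_failure='<smsDeliveryStatus value="Delivery Failure"/>'
line_with_id='<smsDeliveryStatus value="Delivery Successful" id="2"/>'

# A bare /id/ matches the substring "id" anywhere, including inside "Provider".
bare_provider=$(printf '%s\n' "$line_provider" | sed -n '/id/p')
bare_failure=$(printf '%s\n' "$line_failure" | sed -n '/id/p')

# Testing for the attribute syntax (space, id, equals sign, quote) avoids that.
strict_provider=$(printf '%s\n' "$line_provider" | sed -n '/ id="/p')
strict_with_id=$(printf '%s\n' "$line_with_id" | sed -n '/ id="/p')

printf 'bare /id/ on the Provider line:    [%s]\n' "$bare_provider"
printf 'bare /id/ on the Failure line:     [%s]\n' "$bare_failure"
printf 'strict test on the Provider line:  [%s]\n' "$strict_provider"
printf 'strict test on the line with id:   [%s]\n' "$strict_with_id"
```

An empty pair of brackets means the pattern did not select the line. Only the bare /id/ test selects the Provider line, which is why that line reached the final p unchanged.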

So in the end I solved it myself. I hope this helps someone else.
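
Since \b is a GNU extension, a variant that tests for the attribute syntax itself may be more portable. The following is a minimal sketch under a few assumptions: the function name extract_status is hypothetical, the sed in use accepts -E for extended regular expressions (GNU and BSD sed both do), and every attribute is preceded by a space and uses double quotes.

```shell
# Hypothetical helper: reads smsDeliveryStatus lines on stdin and prints
# "value: ..." and "id: ..." for each one, using Unknown for a missing attribute.
extract_status() {
  sed -n -E '
    /smsDeliveryStatus /{
      # Keep a copy of the original line in the hold space.
      h
      # No value attribute: replace the whole line with the fallback text.
      / value="/!s/.*/value: Unknown/
      # Otherwise extract the attribute; this is a no-op after the fallback.
      s/.* value="([^"]*)".*/value: \1/
      p
      # Restore the original line and repeat the same two steps for id.
      x
      / id="/!s/.*/id: Unknown/
      s/.* id="([^"]*)".*/id: \1/
      p
    }
  '
}

# Two lines from blah.txt: one without an id attribute, one with.
sample='<smsDeliveryStatus value="Provider Malfunction"/>
<smsDeliveryStatus value="Delivery Successful" id="2"/>'

result=$(printf '%s\n' "$sample" | extract_status)
printf '%s\n' "$result"
```

Running the fallback substitution before the extraction means the two steps cannot both fire on the same line, so no second negated test against the already rewritten text is needed. For the two sample lines this should print value: Provider Malfunction, id: Unknown, value: Delivery Successful and id: 2, one per line.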
