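I'm trying to figure out how to print everything that occurs between two strings. The catch is that these string pairs occur multiple times on a single line, so I need to be able to print every field within each pair.
I have a file, api.txt, that lists several customers and their respective device inventories. It looks like this: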
Customer [customerId=12000, customerName=Acme, Inc.]
DeviceDetail [baseProductId=router-100, cardDetail=[CardDetail [baseCardId=router-100NIC1, cardDescription=Router 100 NIC, cardSerial=100NIC1], CardDetail [baseCardId=router-100NIC2, cardDescription=Router 100 NIC, cardSerial=100NIC2]], deviceSerial=100PRIMARY, deviceDescription=Router 100 Base Model]
DeviceDetail [baseProductId=router-2500, cardDetail=[CardDetail [baseCardId=router-2500NIC1, cardDescription=Router 2500 NIC, cardSerial=2500NIC1], CardDetail [baseCardId=router-2500NIC2, cardDescription=Router 2500 NIC, cardSerial=2500NIC2]], deviceSerial=2500PRIMARY, deviceDescription=Router 2500 Base Model]
Customer [customerId=24000, customerName=Anvil LLC]
DeviceDetail [baseProductId=router-5000, cardDetail=[CardDetail [baseCardId=router-5000NIC1, cardDescription=Router 5000 NIC, cardSerial=5000NIC1], CardDetail [baseCardId=router-500NIC2, cardDescription=Router 5000 NIC, cardSerial=5000NIC2]], deviceSerial=5000PRIMARY, deviceDescription=Router 5000 Base Model]
DeviceDetail [baseProductId=router-7500, cardDetail=null, deviceSerial=7500PRIMARY, DeviceDescription=Router 7500 Base Model, No NIC]
The output for this should look something like:
"12000","Acme, Inc.","router-100","100PRIMARY","Router 100 Base Model","Router 100 NIC","100NIC1","Router 100 NIC","100NIC2"
"12000","Acme, Inc.","router-2500","2500PRIMARY","Router 2500 Base Model","Router 2500 NIC","2500NIC1","Router 2500 NIC","2500NIC2"
"24000","Anvil LLC","router-5000","5000PRIMARY","Router 5000 Base Model","Router 5000 NIC","5000NIC1","Router 5000 NIC","5000NIC2"
Note that the last DeviceDetail (router-7500) is omitted because the device has no child devices attached (cardDetail=null).
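I understand how to use awk with the field separators set to = and , to capture everything between them (that is, each field value sits between an equals sign and a comma), but I'm not sure how to get the result I'm looking for when multiple instances of CardDetail data occur an unknown number of times on each line, or don't occur at all.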
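A small Python illustration of why splitting on = and , alone is fragile for this data. It assumes the awk idea amounts to something like -F'[=,]' (that exact invocation is an assumption, not taken from the question) and applies the same split to the first sample line:

```python
import re

# First line of api.txt, copied from the sample above.
line = "Customer [customerId=12000, customerName=Acme, Inc.]"

# Split on every '=' or ',' as a field separator would.
fields = re.split(r"[=,]", line)

# The comma inside "Acme, Inc." cuts the customer name into two fields.
for index, field in enumerate(fields):
    print(index, repr(field))
```

The customer name lands in two separate fields, so a plain separator split needs extra handling for values that contain commas.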
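One thing to consider is that each instance of CardDetail is delimited by CardDetail and a closing bracket (]), so it might help to capture each CardDetail instance on every line, but I'm not sure.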
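The idea of grabbing every CardDetail block up to its closing bracket can be sketched in Python with a non-greedy pattern and findall. This is only a sketch: the names CARD_RE and cards are made up for illustration, and it assumes no card field value contains ", " or "]".

```python
import re

# One DeviceDetail line copied from api.txt above.
line = (
    "DeviceDetail [baseProductId=router-100, cardDetail=[CardDetail "
    "[baseCardId=router-100NIC1, cardDescription=Router 100 NIC, "
    "cardSerial=100NIC1], CardDetail [baseCardId=router-100NIC2, "
    "cardDescription=Router 100 NIC, cardSerial=100NIC2]], "
    "deviceSerial=100PRIMARY, deviceDescription=Router 100 Base Model]"
)

# Non-greedy match: everything from "CardDetail [" to the first "]".
CARD_RE = re.compile(r"CardDetail \[(.*?)\]")

# Turn each captured block into a dict of field name -> value.
cards = [
    dict(pair.split("=", 1) for pair in block.split(", "))
    for block in CARD_RE.findall(line)
]

for card in cards:
    print(card["cardDescription"], card["cardSerial"])
```

On a line with cardDetail=null the pattern finds nothing, so cards is an empty list and the line can be skipped.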
I'm also not married to awk. Using sed or any other parsing program is fine too. Basically, whatever works best.
Thanks in advance for any help you can offer!
Answer 0: (score: 2)
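When something gets too unwieldy to handle in awk / sed, it's time to reach for a more "modern" scripting language such as perl, ruby or python. Something like this should get you started (perl):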
#!/usr/bin/env perl
use strict;
use warnings;

my $customerId;
my $customerName;

while (my $line = <DATA>) {
    if ($line =~ m{
            customerId=(?<customerId>.*?),
            \ customerName=(?<customerName>.*)\]
        }x)
    {
        # Remember the current customer for the DeviceDetail lines that follow.
        $customerId   = $+{customerId};
        $customerName = $+{customerName};
    } elsif ($line =~ m{
            baseProductId=(?<baseProductId>.*?),
            \ cardDetail=\[(?<cards>.*)\],
            \ deviceSerial=(?<deviceSerial>.*?),
            \ deviceDescription=(?<deviceDescription>.*)\]
        }x)
    {
        # Lines with cardDetail=null never reach this branch, so they are skipped.
        # Copy the named captures first: the next match below resets %+.
        my %device = %+;
        # Collect cardDescription and cardSerial for every CardDetail on the line.
        my @cards = $device{cards} =~ m{cardDescription=(.*?),\ cardSerial=(.*?)\]}gx;
        print '"'
            . join('","',
                $customerId,
                $customerName,
                @device{qw(baseProductId deviceSerial deviceDescription)},
                @cards,
            )
            . "\"\n";
    }
}
__DATA__
Customer [customerId=12000, customerName=Acme, Inc.]
DeviceDetail [baseProductId=router-100, cardDetail=[CardDetail [baseCardId=router-100NIC1, cardDescription=Router 100 NIC, cardSerial=100NIC1], CardDetail [baseCardId=router-100NIC2, cardDescription=Router 100 NIC, cardSerial=100NIC2]], deviceSerial=100PRIMARY, deviceDescription=Router 100 Base Model]
DeviceDetail [baseProductId=router-2500, cardDetail=[CardDetail [baseCardId=router-2500NIC1, cardDescription=Router 2500 NIC, cardSerial=2500NIC1], CardDetail [baseCardId=router-2500NIC2, cardDescription=Router 2500 NIC, cardSerial=2500NIC2]], deviceSerial=2500PRIMARY, deviceDescription=Router 2500 Base Model]
Customer [customerId=24000, customerName=Anvil LLC]
DeviceDetail [baseProductId=router-5000, cardDetail=[CardDetail [baseCardId=router-5000NIC1, cardDescription=Router 5000 NIC, cardSerial=5000NIC1], CardDetail [baseCardId=router-500NIC2, cardDescription=Router 5000 NIC, cardSerial=5000NIC2]], deviceSerial=5000PRIMARY, deviceDescription=Router 5000 Base Model]
DeviceDetail [baseProductId=router-7500, cardDetail=null, deviceSerial=7500PRIMARY, DeviceDescription=Router 7500 Base Model, No NIC]
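You can find the x option for the m{} match operator in perldoc perlre (search for the /x modifier). Also search the same perldoc for named capture groups, which go with the $+{foo} incantation.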
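Since python was mentioned as an alternative, here is a rough Python sketch of the same approach: re.VERBOSE plays the role of /x, and (?P<name>...) is the named-group syntax. The helper name device_rows and the shortened SAMPLE input are made up for illustration, and the sketch assumes the field order shown in the question's expected output.

```python
import csv
import io
import re

# re.VERBOSE ignores pattern whitespace, so a literal space is written "\ ".
CUSTOMER_RE = re.compile(r"""
    customerId=(?P<customerId>.*?),
    \ customerName=(?P<customerName>.*)\]
""", re.VERBOSE)

# Only matches when cardDetail opens a list, so cardDetail=null is skipped.
DEVICE_RE = re.compile(r"""
    baseProductId=(?P<baseProductId>.*?),
    \ cardDetail=\[(?P<cards>.*)\],
    \ deviceSerial=(?P<deviceSerial>.*?),
    \ deviceDescription=(?P<deviceDescription>.*)\]
""", re.VERBOSE)

CARD_RE = re.compile(r"cardDescription=(.*?), cardSerial=(.*?)\]")


def device_rows(lines):
    """Return one row per device that has at least one CardDetail entry."""
    rows = []
    customer = None
    for line in lines:
        match = CUSTOMER_RE.search(line)
        if match:
            customer = [match["customerId"], match["customerName"]]
            continue
        match = DEVICE_RE.search(line)
        if match and customer:
            row = customer + [
                match["baseProductId"],
                match["deviceSerial"],
                match["deviceDescription"],
            ]
            for description, serial in CARD_RE.findall(match["cards"]):
                row += [description, serial]
            rows.append(row)
    return rows


# Shortened sample: one device with two cards, one device with none.
SAMPLE = """\
Customer [customerId=12000, customerName=Acme, Inc.]
DeviceDetail [baseProductId=router-100, cardDetail=[CardDetail [baseCardId=router-100NIC1, cardDescription=Router 100 NIC, cardSerial=100NIC1], CardDetail [baseCardId=router-100NIC2, cardDescription=Router 100 NIC, cardSerial=100NIC2]], deviceSerial=100PRIMARY, deviceDescription=Router 100 Base Model]
Customer [customerId=24000, customerName=Anvil LLC]
DeviceDetail [baseProductId=router-7500, cardDetail=null, deviceSerial=7500PRIMARY, DeviceDescription=Router 7500 Base Model, No NIC]
"""

rows = device_rows(SAMPLE.splitlines())
out = io.StringIO()
# QUOTE_ALL wraps every field in double quotes, as in the desired output.
csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
print(out.getvalue(), end="")
```

Using the csv module rather than joining strings by hand means a field that itself contains a double quote would still be escaped correctly.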