I'm running into a performance problem with gawk 3.1.5 (Linux) that seems to be related to using a for loop.
Running this code
BEGIN {
    TagOpen["Req"]   = "<S:Envelope"
    TagClose["Req"]  = "<\\/S:Envelope"
    TagOpen["Resp"]  = "<SOAP-ENV:Envelope"
    TagClose["Resp"] = "<\\/SOAP-ENV:Envelope"
}
{ if ( NR % 10000 == 0 ) print NR }
{
    for (i in TagOpen) {
        if ( match($0, TagOpen[i]) ) printf "Open [%s]\n", i
        if ( match($0, TagClose[i]) ) printf "Close [%s]\n", i
    }
}
on a text file of 800,000 lines takes:
real 0m56.84s
user 0m56.02s
sys 0m0.29s
Running the apparently equivalent
BEGIN {
    TagOpen["Req"]   = "<S:Envelope"
    TagClose["Req"]  = "<\\/S:Envelope"
    TagOpen["Resp"]  = "<SOAP-ENV:Envelope"
    TagClose["Resp"] = "<\\/SOAP-ENV:Envelope"
}
{ if ( NR % 10000 == 0 ) print NR }
{
    i="Req"
    if ( match($0, TagOpen[i]) ) printf "Open [%s]\n", i
    if ( match($0, TagClose[i]) ) printf "Close [%s]\n", i
    i="Resp"
    if ( match($0, TagOpen[i]) ) printf "Open [%s]\n", i
    if ( match($0, TagClose[i]) ) printf "Close [%s]\n", i
}
takes:
real 0m3.36s
user 0m3.23s
sys 0m0.21s
I can't believe what I'm seeing!
Any ideas?
P.S. This does not happen with the "legacy" awk on HP-UX.