I have the following data on a single line.
2014-12-30 00:00:02,317 pool-14076-thread-3 DEBUG [com.fundamo.connector.airtime.service.AirtimeService] ERS Response XML - <soap:Envelope><soap:Body><TopUpPhoneAccountResult><MessageID>1913351092</MessageID><MessageRefID>BD9123000000003</MessageRefID><TopUpPhoneAccountStatus><StatusID>200</StatusID><Comment>Transaction Successful</Comment></TopUpPhoneAccountStatus><TopUpPhoneAccountAmountSent><Amount>2000</Amount><AmountExcludingTax>2000</AmountExcludingTax><TaxName/><TaxAmount>0</TaxAmount><PhoneNumber>1766910910</PhoneNumber><ResponseDateTime>20141230000002320</ResponseDateTime><ServiceType>PRETOP</ServiceType><CurrencyCode>TK</CurrencyCode></TopUpPhoneAccountAmountSent></TopUpPhoneAccountResult></soap:Body></soap:Envelope>
Now I want to extract some values from it. I used this command:
cat ERS_RESPONSE_30Dec_atp11.txt |awk -F'<' '{print $1 "," $5 "," $7 "," $10 ","$12"," $16 "," $23}'
Output:
2014-12-30 00:00:02,317 pool-14076-thread-3 DEBUG [com.fundamo.connector.airtime.service.AirtimeService] ERS Response XML - ,MessageID>1913351092,MessageRefID>BD9123000000003,StatusID>200,Comment>Transaction Successful,Amount>2000,PhoneNumber>1766910910
However, I only want the fields shown below.
2014-12-30 00:00:02,317 ,1913351092,BD9123000000003,200,Transaction Successful,2000,1766910910
How can I do that?
Answer 0 (score: 2)
Here is an awk solution:
awk -F"[ <>]" '{print $1" "$2,$18,$22,$28,$32" "$33,$41,$55}' OFS=, ERS_RESPONSE_30Dec_atp11.txt
2014-12-30 00:00:02,317,1913351092,BD9123000000003,200,Transaction Successful,2000,1766910910
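Here are some tips. The field separator is set to a space, < and >. To find the field numbers you need, print every field together with its index: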
awk -F"[ <>]" '{for (i=1;i<=NF;i++) print i"="$i}' file
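The field-numbering tip can be tried without the log file. The sketch below runs the same loop on a shortened sample line (made up for illustration, not the full log entry) and skips the empty fields that appear wherever two separators are adjacent, such as between > and <.

```shell
# Shortened sample line, for illustration only (not the full log entry)
line='2014-12-30 00:00:02,317 DEBUG - <a><MessageID>1913351092</MessageID></a>'

# Print index=value for every non-empty field; space, < and > each act as a separator
fields=$(printf '%s\n' "$line" | awk -F'[ <>]' '{for (i=1;i<=NF;i++) if ($i != "") print i"="$i}')
printf '%s\n' "$fields"
```

In this sample the message ID shows up as 9=1913351092; in the real log line the extra words and tags shift it to field 18, which is why the command above prints $18.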
Answer 1 (score: 0)
You can try the following (it is a bit long); append your file name to the command:
sed 's#\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\} [0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\},\S*\).*<MessageID>\([[:digit:]]\{1,\}\)<.*<MessageRefID>\([[:alpha:]]\{1,\}[[:digit:]]\{1,\}\).*<StatusID>\([[:digit:]]\{1,\}\).*\(Transaction Successful\).*<Amount>\([[:digit:]]\{1,\}\).*<PhoneNumber>\([[:digit:]]\{1,\}\).*#\1 ,\2,\3,\4,\5,\6,\7#g'
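Once it works, replace sed with sed -i.bak to apply the change to the file itself while keeping a backup of the original (I tested the command on my side).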
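Note that \S is a GNU extension rather than POSIX basic regular expression syntax, so the command may behave differently in other sed implementations. A more portable variant might look like the sketch below. It assumes the values never contain <, captures each one as [^<]*, and uses [^ ]* in place of \S*. The sample line is shortened for illustration and is not the full log entry.

```shell
# Shortened sample line, for illustration only (it keeps just the tags the pattern needs)
line='2014-12-30 00:00:02,317 pool-1 DEBUG [x] ERS Response XML - <r><MessageID>1913351092</MessageID><MessageRefID>BD9123000000003</MessageRefID><StatusID>200</StatusID><Comment>Transaction Successful</Comment><Amount>2000</Amount><PhoneNumber>1766910910</PhoneNumber></r>'

# \1 = timestamp, \2..\7 = text up to the next "<" after each opening tag
result=$(printf '%s\n' "$line" | sed 's#^\([0-9-]* [0-9:]*,[^ ]*\).*<MessageID>\([^<]*\)<.*<MessageRefID>\([^<]*\)<.*<StatusID>\([^<]*\)<.*<Comment>\([^<]*\)<.*<Amount>\([^<]*\)<.*<PhoneNumber>\([^<]*\)<.*#\1,\2,\3,\4,\5,\6,\7#')
printf '%s\n' "$result"
# prints: 2014-12-30 00:00:02,317,1913351092,BD9123000000003,200,Transaction Successful,2000,1766910910
```

Because the comment is captured as [^<]* instead of the literal text Transaction Successful, lines with a different status comment would also be rewritten.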
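Answer 2 (score: 0)
On Solaris you need to use nawk rather than awk. In the Solaris version of awk, the -F argument accepts only a single character, whereas in nawk it can take a regular expression.
You need to specify the whole <.....> pattern as the separator, not just <: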
This works on a Mac:
$ awk -F'<[^<]+>' '{print $1 "," $5 "," $7 "," $10 ","$12"," $16 "," $23}' ERS_RESPONSE_30Dec_atp11.txt
On Solaris, try the following:
$ nawk -F'<[^<]+>' '{print $1 "," $5 "," $7 "," $10 ","$12"," $16 "," $23}' ERS_RESPONSE_30Dec_atp11.txt
If that does not work...
$ nawk -F'<[^<][^<]*>' '{print $1 "," $5 "," $7 "," $10 ","$12"," $16 "," $23}' ERS_RESPONSE_30Dec_atp11.txt
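With a whole tag as the separator, $1 still holds the entire log prefix up to "ERS Response XML - ", not just the timestamp. One possible way to trim it, sketched here with the log line from the question stored in a shell variable, is to split $1 on whitespace and keep only its first two pieces. The variable names line, result and p are illustrative; on Solaris, nawk would take the place of awk.

```shell
# The log line from the question, stored in a variable so the example is self-contained
line='2014-12-30 00:00:02,317 pool-14076-thread-3 DEBUG [com.fundamo.connector.airtime.service.AirtimeService] ERS Response XML - <soap:Envelope><soap:Body><TopUpPhoneAccountResult><MessageID>1913351092</MessageID><MessageRefID>BD9123000000003</MessageRefID><TopUpPhoneAccountStatus><StatusID>200</StatusID><Comment>Transaction Successful</Comment></TopUpPhoneAccountStatus><TopUpPhoneAccountAmountSent><Amount>2000</Amount><AmountExcludingTax>2000</AmountExcludingTax><TaxName/><TaxAmount>0</TaxAmount><PhoneNumber>1766910910</PhoneNumber><ResponseDateTime>20141230000002320</ResponseDateTime><ServiceType>PRETOP</ServiceType><CurrencyCode>TK</CurrencyCode></TopUpPhoneAccountAmountSent></TopUpPhoneAccountResult></soap:Body></soap:Envelope>'

result=$(printf '%s\n' "$line" | awk -F'<[^<]+>' -v OFS=, '{
    split($1, p, " ")    # p[1] = date, p[2] = time with milliseconds
    print p[1] " " p[2], $5, $7, $10, $12, $16, $23
}')
printf '%s\n' "$result"
# prints: 2014-12-30 00:00:02,317,1913351092,BD9123000000003,200,Transaction Successful,2000,1766910910
```

Setting OFS lets print join the fields with commas, so the "," literals between the fields are no longer needed.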