Extracting specific values from a text file with long lines

Time: 2013-06-27 13:24:32

Tags: regex awk

I am trying to get all the "CP" values from the following log file:

2013-06-27 17:00:00,017 INFO - [AlertSchedulerThread18] [2013-06-27 16:59:59, 813] -- SN: 989333333333 ||DN: 989333333333 ||CategoryId: 4687 ||CGID: null||Processing started ||Billing started||Billing Process: 97 msec ||Response code: 2001 ||Package id: 4387 ||TransactionId: 66651372336199820989389553437483742||CDR:26 msec||CDR insertion: 135 msec||Successfully inserted in CDR Table||CP:53 msec||PROC - 9 msec||Successfully executed procedure call.||Billing Ended||197 msec ||Processing ended
2013-06-27 17:00:00,018 INFO - [AlertSchedulerThread62] [2013-06-27 16:59:59, 824] -- SN: 989333333333 ||DN: 989333333333 ||CategoryId: 3241 ||CGID: null||Processing started ||Billing started||Billing Process: 61 msec ||Response code: 2001 ||Package id: 2861 ||TransactionId: 666513723361998319893580191324005184||CDR:25 msec||CDR insertion: 103 msec||Successfully inserted in CDR Table||CP:59 msec||PROC - 24 msec||Successfully executed procedure call.||Billing Ended||187 msec ||Processing ended
2013-06-27 17:00:00,028 INFO - [AlertSchedulerThread29] [2013-06-27 16:59:59, 903] -- SN: 989333333333 ||DN: 989333333333 ||CategoryId: 4527 ||CGID: null||Processing started ||Billing started||Billing Process: 47 msec ||Response code: 2001 ||Package id: 4227 ||TransactionId: 666513723361999169893616006323701572||CDR:22 msec||CDR insertion: 83 msec||Successfully inserted in CDR Table||CP:21 msec||PROC - 7 msec||Successfully executed procedure call.||Billing Ended||112 msec ||Processing ended

...to get output like this:

CP:53 msec
CP:59 msec
CP:21 msec

How can I do this with awk?

6 Answers:

Answer 0: (score: 3)

For things like this, cut is always nice and fast:

$ cut -d"*" -f3 file
CP:53 msec
CP:59 msec
CP:21 msec

In any case, these awk approaches also do the job:

$ awk -F"|" '{print $27}' file  | sed 's/*//g'
CP:53 msec
CP:59 msec
CP:21 msec

$ awk -F"\|\|" '{print $14}' file | sed 's/*//g'
CP:53 msec
CP:59 msec
CP:21 msec

or

$ awk -F"*" '{print $3}' file
CP:53 msec
CP:59 msec
CP:21 msec

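In every variant we set the field separator so that each line is split on a specific character (`|` or `*`), and then we print the piece of the split text that holds the value. The `*`-based variants assume the CP field is wrapped in `**` in the input, which is also why the commands that split on `|` pipe their output through sed to strip the asterisks.

Answer 1: (score: 2)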
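To make the field splitting concrete, here is a minimal sketch run against a shortened, made-up sample (the temporary file and its two lines are assumptions for illustration, not the real log). Instead of hard-coding a field number, it loops over the `||`-separated fields and prints whichever one starts with `CP:`, which may be more forgiving if the number of fields differs between lines.

```shell
# Hypothetical shortened sample; the real log lines carry many more fields.
log=$(mktemp)
cat > "$log" <<'EOF'
2013-06-27 17:00:00,017 INFO -- SN: 1 ||DN: 1 ||CDR:26 msec||CP:53 msec||PROC - 9 msec||Processing ended
2013-06-27 17:00:00,018 INFO -- SN: 2 ||CDR:25 msec||CP:59 msec||PROC - 24 msec||Processing ended
EOF

# Split on the literal two-character separator "||" and print the field
# that begins with "CP:", wherever it sits on the line.
result=$(awk -F'[|][|]' '{ for (i = 1; i <= NF; i++) if ($i ~ /^CP:/) print $i }' "$log")
printf '%s\n' "$result"

rm -f "$log"
```

Note that the CP field is the fourth field on the first sample line and the third on the second, so a fixed `$14`-style index would not work here, while the loop does.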

How about a fun sed command?

sed -n 's/.*\*\*\(.*\)\*\*.*/\1/p'

Answer 2: (score: 2)

Using awk:

awk -F"[|*]+" '{ print $14 }' file

Answer 3: (score: 2)

$ awk -F'[|][|]' '{print $14}' file
**CP:53 msec**

**CP:59 msec**

**CP:21 msec**

If your input really does contain '*' characters, just tweak the command to remove them:

$ awk -F'[|][|]' '{gsub(/\*/,""); print $14}' file
CP:53 msec

CP:59 msec

CP:21 msec

Answer 4: (score: 2)

There is always grep:

grep -o 'CP:[[:digit:]]* msec' log.txt

If the unit is not necessarily msec every time, you can grab everything up to the next pipe:

grep -o 'CP:[^|]*' log.txt
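The difference between the two patterns only shows up when the unit varies. The sketch below uses a made-up two-line sample (an assumption for illustration; the `sec` unit does not appear in the original log) to compare them. It relies on the `-o` option, which GNU and BSD grep support but which older POSIX specifications do not list.

```shell
# Hypothetical sample in which the second line reports seconds, not msec.
log=$(mktemp)
cat > "$log" <<'EOF'
a||CDR:26 msec||CP:53 msec||PROC - 9 msec
b||CDR:25 msec||CP:2 sec||PROC - 24 msec
EOF

# Strict pattern: digits followed by the literal " msec".
strict=$(grep -o 'CP:[[:digit:]]* msec' "$log")

# Loose pattern: everything after "CP:" up to, but not including, the next "|".
loose=$(grep -o 'CP:[^|]*' "$log")

printf 'strict:\n%s\n' "$strict"
printf 'loose:\n%s\n' "$loose"

rm -f "$log"
```

The strict pattern skips the `sec` line entirely, while the loose one returns both values. The loose pattern would also keep any whitespace that sits between the value and the `||`.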

Answer 5: (score: 2)

Code for GNU sed:

$ sed -r 's/.*(CP:[0-9]+\smsec).*/\1/' file
CP:53 msec
CP:59 msec
CP:21 msec
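The command above passes any line that does not match through unchanged, and `\s` is a GNU extension. As a hedged variation, assuming a sed that accepts `-E` for extended regular expressions (current GNU and BSD versions do), the sketch below uses `-n` with the `p` flag so that only lines where the substitution succeeded are printed, and `[[:space:]]` in place of `\s`. The sample lines are made up for illustration.

```shell
# Hypothetical sample: two lines with a CP value, one without.
log=$(mktemp)
cat > "$log" <<'EOF'
a||CDR:26 msec||CP:53 msec||PROC - 9 msec
no timing information on this line
b||CDR:25 msec||CP:59 msec||PROC - 24 msec
EOF

# -n suppresses automatic printing; the trailing p prints a line only
# when the substitution actually replaced something.
result=$(sed -n -E 's/.*(CP:[0-9]+[[:space:]]msec).*/\1/p' "$log")
printf '%s\n' "$result"

rm -f "$log"
```

With this form the line lacking a CP value is dropped instead of being echoed back.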