Command to extract different strings from a line in the shell

Time: 2016-01-06 17:41:40

Tags: shell awk sed substr

I have a text file. Some sample lines from the file look like this:

MESS01: Java flow 'com.java.abc.SupportToolsOutput' on jvm group 'JVM123' is running.
MESS01: Java flow 'com.java.abc.ErrorNotify' on jvm group 'JVM123' is running.
MESS01: Java flow 'com.java.abc.Output' on jvm group 'JVM123' is running.
MESS01: Java flow 'com.java.abc.LogRequest' on jvm group 'JVM123' is running.
MESS01: Java flow 'com.java.abc.Router' on jvm group 'JVM123' is running.
MESS01: Java flow 'com.java.abc.ProcessMessageNextGen' on jvm group 'JVM123' is running.
MESS01: Java flow 'com.java.abc.RouteMessage' on jvm group 'JVM123' is stopped.

I am trying to get the following output with a single shell command:

com.java.abc.SupportToolsOutput,running
com.java.abc.ErrorNotify,running
com.java.abc.Output,running
com.java.abc.LogRequest,running
com.java.abc.Router,running
com.java.abc.ProcessMessageNextGen,running
com.java.abc.RouteMessage,stopped

I tried using substr and awk.

I tried

cat textfile.txt|awk '{print substr($4,2,length($4)-1)}'|sed "s/'/ /g"

and

cat textfile.txt|awk '{ print $4,$10 }'|sed "s/'/ /g"

but could not get the desired result.

Please help.

Update: what if my text file looks like this

MESS01: Java flow 'com.java.abc.SupportToolsOutput' on jvm group 'JVM123' is running.

Additional thread instances: '0'
Deployed: '1/11/12 2:44 AM' in Bar file '/www/deploy/JVM123/SupportToolsOutputDEV_2012-01-11_02-44-27.bar'
Last edited: '1/10/12 5:02 PM'
UUID: 'f9a9f0cb-3401-0000-0080-b85eb6410185'
Start mode: 'Maintained'
Long description: ''
User-defined property names:
  'BackOutThreshold' = '1'
  'LogLevel' = 'ERROR'
  'MaxPerInterval' = '5'
  'NotificationInterval' = '300'
Keywords:

--------
MESS01: Java flow 'com.java.abc.ErrorNotify' on jvm group 'JVM123' is running.

Additional thread instances: '0'
Deployed: '1/11/12 2:45 AM' in Bar file '/www/deploy/JVM123/ErrorNotifyDEV_2012-01-11_02-45-45.bar'
Last edited: '1/10/12 5:04 PM'
UUID: 'efcff1cb-3401-0000-0080-b85eb6410185'
Start mode: 'Maintained'
Long description: ''
User-defined property names:
  'LogLevel' = 'ERROR'
  'MaxPerInterval' = '5'

Keywords:

--------
MESS01: Java flow 'com.java.abc.Output' on jvm group 'JVM123' is running.

Additional thread instances: '0'
Deployed: '1/11/12 2:46 AM' in Bar file '/www/deploy/JVM123/OutputDEV_2012-01-11_02-46-44.bar'
Last edited: '1/10/12 3:30 PM'
UUID: '1fbbf2cb-3401-0000-0080-b85eb6410185'
Start mode: 'Maintained'
Long description: ''
User-defined property names:
  'BackOutThreshold' = '1'
  'BasicAuthorization' = 'YWRtaW46cGFzc3dvcmQ='
  'LogLevel' = 'ERROR'
  'MaxPerInterval' = '5'
  'NotificationInterval' = '300'
  'ProxyAuthorization' = 'QTkwNzk2MzpnNzVuajZqcQ=='
  'isSslSecured' = 'FALSE'
Keywords:

--------
MESS01: Java flow 'com.java.abc.LogRequest' on jvm group 'JVM123' is running.

Additional thread instances: '4'
Deployed: '1/11/12 2:48 AM' in Bar file '/www/deploy/JVM123/LogRequestDEV_2012-01-11_02-48-54.bar'
Last edited: '1/10/12 4:00 PM'
UUID: '60b4f4cb-3401-0000-0080-b85eb6410185'
Start mode: 'Maintained'
Long description: ''
User-defined property names:
  'EVENTTYPE' = 'Integration_RequestSent'
  'LogLevel' = 'ERROR'
  'MaxPerInterval' = '5'
  'NotificationInterval' = '300'
  'SOARTMCompliant' = 'FALSE'
Keywords:

--------
MESS01: Java flow 'com.java.abc.Router' on jvm group 'JVM123' is stopped.

Additional thread instances: '4'
Deployed: '1/11/12 2:49 AM' in Bar file '/www/deploy/JVM123/RouterDEV_2012-01-11_02-49-32.bar'
Last edited: '1/10/12 4:10 PM'
UUID: '8d46f5cb-3401-0000-0080-b85eb6410185'
Start mode: 'Maintained'
Long description: ''
User-defined property names:
  'BackOutThreshold' = '1'
  'LogLevel' = 'ERROR'
  'MaxPerInterval' = '5'
  'NotificationInterval' = '300'
Keywords:
--------
MESS02 : Java file 'Integration.jar' on on jvm group 'JVM123'. 
Deployed: '1/11/12 2:46 AM' in Bar file '/www/deploy/JVM123/OutputDEV_2012-01-11_02-46-44.bar'
Last edited: '1/10/12 4:10 PM'
Keywords:

--------
MESS02 : Java file 'SAPAdapter.adapter' on on jvm group 'JVM123'. 
Deployed: '1/11/12 2:46 AM' in Bar file '/www/deploy/JVM123/OutputDEV_2011-11-10_22-55-55.bar'
Last edited: '1/10/14 14:55 PM'
Keywords:

I want my output to be

JVM123,/www/deploy/JVM123/OutputDEV_2012-01-11_02-46-44.bar,Integration.jar
JVM123,/www/deploy/JVM123/OutputDEV_2011-11-10_22-55-55.bar,SAPAdapter.adapter
JVM123,/www/deploy/JVM123/SupportToolsOutputDEV_2012-01-11_02-44-27.bar,com.java.abc.SupportToolsOutput,running
JVM123,/www/deploy/JVM123/ErrorNotifyDEV_2012-01-11_02-45-45.bar,com.java.abc.ErrorNotify,running
JVM123,/www/deploy/JVM123/OutputDEV_2012-01-11_02-46-44.bar,com.java.abc.Output,running
JVM123,/www/deploy/JVM123/LogRequestDEV_2012-01-11_02-48-54.bar,com.java.abc.LogRequest,running
JVM123,/www/deploy/JVM123/RouterDEV_2012-01-11_02-49-32.bar,com.java.abc.Router,stopped

5 Answers:

Answer 0 (score: 2)

With awk you can do this:

awk -F "[' ]+" '{print $4 "," $NF}' textfile.txt
com.java.abc.SupportToolsOutput,running.
com.java.abc.ErrorNotify,running.
com.java.abc.Output,running.
com.java.abc.LogRequest,running.
com.java.abc.Router,running.
com.java.abc.ProcessMessageNextGen,running.
com.java.abc.RouteMessage,stopped.

To remove the dot from the last field, use:

awk -F "[' ]+" '{sub(/\./, "", $NF); print $4 "," $NF}' textfile.txt
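To see why $4 and $NF are the right fields, it can help to print every field with its index. The following is only a sketch: the variable names line and fields are illustrative, and the sample line is one of those from the question.

```shell
# One sample line copied from the question.
line="MESS01: Java flow 'com.java.abc.Router' on jvm group 'JVM123' is running."

# The separator is any run of spaces and single quotes, so the quotes
# vanish and the flow name lands in $4, the state in the last field.
fields=$(printf '%s\n' "$line" |
  awk -F "[' ]+" '{ for (i = 1; i <= NF; i++) print i ": " $i }')

printf '%s\n' "$fields"
```

With this separator the line splits into ten fields, and the last one still carries the trailing dot, which is why the sub() call is needed.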

Answer 1 (score: 1)

Another awk:

$ awk -v q="'" '{gsub(q,""); print $4 "," $NF}' log

com.java.abc.SupportToolsOutput,running.
com.java.abc.ErrorNotify,running.
com.java.abc.Output,running.
com.java.abc.LogRequest,running.
com.java.abc.Router,running.
com.java.abc.ProcessMessageNextGen,running.
com.java.abc.RouteMessage,stopped.

This one also removes the trailing period, if that matters:

$ awk -v q="'" '{gsub(q,""); sub(/\.$/,""); print $4","$NF}' log

com.java.abc.SupportToolsOutput,running
com.java.abc.ErrorNotify,running
com.java.abc.Output,running
com.java.abc.LogRequest,running
com.java.abc.Router,running
com.java.abc.ProcessMessageNextGen,running
com.java.abc.RouteMessage,stopped

Answer 2 (score: 1)

This one grabs the state when it sees MESS01 and prints when it reaches Deployed. It uses a simple, straightforward regex as the field separator:

LC_ALL=C awk -F "[ ']" -v OFS=, '/^MESS01:/ { sub(/\.$/,""); o=$5; j=$11; s=$NF; } /^Deployed:/ { print j, $(NF-1), o, s }' textfile.txt

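Tested... it produces what you want. Notes:

  • No cats were harmed in the making of this script. ;)
  • Your log looks like it is all ASCII, so using LC_ALL=C may make the script noticeably faster.
  • Using NF helps avoid any trouble with the dates and is intuitive (it tells the reader that the intent of the code is to look at the last field in one case and the next-to-last field in the other).
  • The reason for the sub() in the MESS01 rule is that adding "." to the separator regex would break up your object name, which itself contains dots.
  • In the future, you may find it better to use only whitespace as the separator and to use gsub() to filter ^'|'$ out of the jvm, object and path. In that case you need to pass the filtering regex in as a variable (because of the quoting confusion).

Here is a version using the gsub() mentioned in the last bullet: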

LC_ALL=C awk -v OFS=, -v r="^'|'$" '/^MESS01:/ { o=$4; j=$8; s=$NF; gsub(r,"",o); gsub(r,"",j); sub(/\.$/,"",s) } /^Deployed:/ { p=$NF; gsub(r,"",p); print j, p, o, s }' textfile.txt
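The updated log also contains MESS02 records, which should print only three columns. The commands above do not treat those separately, so here is a minimal sketch of one way to handle both record types. It assumes every record has exactly one Deployed: line after its header, and that a single quote as the field separator is acceptable. The variable names (log, out, name, jvm, state) are illustrative, the sample is shortened from the question, and lines come out in file order rather than with the MESS02 entries first.

```shell
# Abbreviated sample built from records in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
MESS01: Java flow 'com.java.abc.Router' on jvm group 'JVM123' is stopped.

Additional thread instances: '4'
Deployed: '1/11/12 2:49 AM' in Bar file '/www/deploy/JVM123/RouterDEV_2012-01-11_02-49-32.bar'
Last edited: '1/10/12 4:10 PM'
Keywords:
--------
MESS02 : Java file 'Integration.jar' on on jvm group 'JVM123'.
Deployed: '1/11/12 2:46 AM' in Bar file '/www/deploy/JVM123/OutputDEV_2012-01-11_02-46-44.bar'
Last edited: '1/10/12 4:10 PM'
Keywords:
EOF

# With a single quote as separator, $2 is the first quoted value on a
# line and $4 the second; on MESS01 lines $5 holds " is <state>.".
out=$(LC_ALL=C awk -F"'" -v OFS=, '
  /^MESS01/ { name = $2; jvm = $4; state = $5
              sub(/^ is /, "", state); sub(/\. *$/, "", state) }
  /^MESS02/ { name = $2; jvm = $4; state = "" }
  /^Deployed:/ && name != "" {
      if (state != "") print jvm, $4, name, state
      else             print jvm, $4, name
  }
' "$log")
rm -f "$log"

printf '%s\n' "$out"
```

Resetting state on each MESS02 header is what keeps a stale running/stopped value from leaking into the jar lines.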

Answer 3 (score: 0)

Do you mean:

 cat test.txt|awk -F\  '{ print $4","$10 }' | sed "s/'//g"|sed "s/\.$//"

which gives

com.java.abc.SupportToolsOutput,running
com.java.abc.ErrorNotify,running
com.java.abc.Output,running
com.java.abc.LogRequest,running
com.java.abc.Router,running
com.java.abc.ProcessMessageNextGen,running
com.java.abc.RouteMessage,stopped

Answer 4 (score: 0)

For perspective, here is a sed solution.

If you are only parsing text like your example, the following may be enough:

sed "s/[^']*'//;s/'.* /,/" logfile

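The two sed commands here (1) delete everything up to and including the first single quote, and (2) replace everything from the next single quote through the last space on the line with a comma. Add ;s/\.$// to remove the trailing period.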
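Running the substitutions one at a time makes each step visible. This is just an illustrative sketch using one sample line from the question; the variable names step1, step2 and step3 are made up for the example.

```shell
line="MESS01: Java flow 'com.java.abc.Output' on jvm group 'JVM123' is running."

# (1) Delete everything up to and including the first single quote.
step1=$(printf '%s\n' "$line" | sed "s/[^']*'//")

# (2) Replace from the next single quote through the last space with a comma.
step2=$(printf '%s\n' "$step1" | sed "s/'.* /,/")

# (3) Optionally drop the trailing period.
step3=$(printf '%s\n' "$step2" | sed 's/\.$//')

printf '%s\n' "$step1" "$step2" "$step3"
```

The greedy .* in step 2 is what carries the match all the way to the last space, skipping over the JVM group name.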

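If you are parsing the larger log you added in your update, you will need to adjust this slightly. The following produces the output you originally asked for from the extended log: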

sed -n "/^MESS01:/{;s/[^']*'//;s/'.* /,/;p;}" logfile

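This is the same as the earlier script, but restricted to lines that begin with MESS01:. It also prints only the lines that match that pattern.

For the output you mention at the bottom of your question, I would advise against sed, because you need to pull data from different fields in what are essentially multi-line records. It is probably possible in sed, but it would be very complicated. Awk is a much better fit.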