How to write a bash script to filter some data from a log file

Time: 2015-10-19 18:47:16

Tags: bash shell awk grep cut

I have a log file in this format.

-----------------------------------------------------------
name=abc
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:5/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=xyz
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:3/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=awd
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:2/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

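I want to extract each person's name and years lived from the log file, but only for records where the number of years is greater than some threshold (say, 2). The file will also contain duplicate names with different details.

Output: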

name:abc
yearsLived:5
name:xyz
yearsLived:3

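I tried to do this with the grep and cut commands. The problem is that once I apply grep or cut, I lose the other part, i.e. either the name or the address. How can I solve this?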
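One way to avoid losing either part is to let a single grep match both patterns at once. This is a rough sketch, not a full solution: `name_and_years` is a hypothetical helper name, and `-o` (print only the matched text) is supported by GNU and BSD grep but is not required by POSIX. It keeps each name next to its yearsLived value; comparing the number against a threshold still needs a tool that can do arithmetic, such as awk.

```shell
# name_and_years FILE: print every name= line and every yearsLived:N token,
# in file order, so each name is followed by its years value.
# (Hypothetical helper; relies on grep -o, a common non-POSIX extension.)
name_and_years() {
    grep -oE '^name=.*|yearsLived:[0-9]+' "$1"
}
```

Called as `name_and_years records`, this should list each `name=...` line followed by the matching `yearsLived:N` token, with the dashed separators and healthDetails lines dropped.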

2 answers:

Answer 0 (score: 1)

Here's a stab at it (the three-argument match() is a GNU awk extension):

awk 'BEGIN {RS = "name="} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {print $1 "\t" years[2]}' records_file

Edit: to accommodate the updated sample log lines and the desired output:

awk 'BEGIN {RS = "-{59}"} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {sub("=", ":", $1); print $1 "\n" yl[0]}' records

Edit 2: Oops, I meant to add a comment: to change the threshold for the number of years matched, change the second 2 in years[2] > 2. Hope that helps.
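Instead of editing the literal in the program, the threshold can be passed in as an awk variable. The following is only a sketch: the function name `years_over` and its argument order are made up for illustration, and it reads the file line by line with POSIX `match()`, `RSTART` and `RLENGTH` rather than the GNU-specific record separator and three-argument `match()` used above.

```shell
# years_over MIN FILE: print name and yearsLived for every record whose
# yearsLived value is strictly greater than MIN.
# (Illustrative helper; name and argument order are assumptions.)
years_over() {
    awk -v min="$1" '
        # Remember the most recent name; "name=" is 5 characters long.
        /^name=/ { name = substr($0, 6) }

        # match() sets RSTART and RLENGTH for the first yearsLived:N token.
        match($0, /yearsLived:[0-9]+/) {
            # "yearsLived:" is 11 characters, so the digits start at RSTART + 11.
            years = substr($0, RSTART + 11, RLENGTH - 11)
            if (years + 0 > min + 0)
                print "name:" name "\nyearsLived:" years
        }
    ' "$2"
}
```

With the sample log saved as `records`, `years_over 2 records` prints the name and yearsLived lines for abc (5) and xyz (3) and skips awd (2). Because the comparison is numeric, values of 10 or more are handled as well.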

Answer 1 (score: 0)

Use awk, like this:

awk '$0~/^name/{split($0,a,"=")}{if($0~/yearsLived:[3-9]/){split($0,b,":|/");print "name:",a[2] "\nyearsLived: "b[9]}}' 'my_file'

Breaking down the one-liner

Create a text file named awkscript and add the following code:

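#!/bin/awk -f
# On every line that starts with "name", split on "=" so that a[2] holds the name
$0~/^name/{
    split($0,a,"=")
}
# On every line where yearsLived starts with a digit 3-9 (more than 2 for
# single-digit values; 10 to 29 would not match), split on ":" or "/" and
# print the name and the years lived
{if($0~/yearsLived:[3-9]/){
    split($0,b,":|/");print "name:",a[2] "\nyearsLived: "b[9] # b[9] is the years value in the sample layout
}
}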

Now run the awk script from your shell, like so:

awk -f awkscript my_file