Question

I have a log file in this format.

-----------------------------------------------------------
name=abc
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:5/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=xyz
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:3/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=awd
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:2/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

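For each name in the log file, I want to extract the person's name and years lived, but only where the number of years is greater than some threshold (say 2). The file will also contain duplicate names with different details.

Output: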

name:abc
yearsLived:5
name:xyz
yearsLived: 3

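I tried to do this with the grep and cut commands. The problem I'm facing is that as soon as I run grep or cut, I lose the other part, either the name or the address. How can I solve this?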

Answer 1

Here's a stab at it (this needs GNU awk, since it uses the three-argument form of match()):

awk 'BEGIN {RS = "name="} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {print $1 "\t" years[2]}' records_file

Edit: to accommodate the updated log sample and the desired output:

awk 'BEGIN {RS = "-{59}"} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {sub("=", ":", $1); print $1 "\n" yl[0]}' records

Edit 2: Oops, I forgot to add a comment: to change the threshold for the number of years matched, change the second 2 in years[2] > 2. Hope that helps.
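As a variation on the threshold remark above, here is a minimal sketch that passes the threshold in from the command line instead of editing the program. It is not the command from this answer: it works line by line and uses only the two-argument match() with RSTART and RLENGTH, so it should run in any POSIX awk. The variable name min and the temporary file are assumptions made for the illustration.

```shell
# Temporary sample file, copied from the log format in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
-----------------------------------------------------------
name=abc
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:5/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=xyz
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:3/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------

name=awd
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:2/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------
EOF

# min (hypothetical name) is the threshold: entries with yearsLived > min are printed.
result=$(awk -v min=2 '
  /^name=/ { name = substr($0, 6) }          # remember the most recent name
  match($0, /yearsLived:[0-9]+/) {           # sets RSTART and RLENGTH on a hit
    # "yearsLived:" is 11 characters long, so the digits start at RSTART + 11
    years = substr($0, RSTART + 11, RLENGTH - 11) + 0
    if (years > min) print "name:" name "\nyearsLived:" years
  }
' "$log")
printf '%s\n' "$result"
rm -f "$log"
```

Changing the threshold is then a matter of running the same program with, say, -v min=4.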

Answer 2

Use awk, like this:

awk '$0~/^name/{split($0,a,"=")}{if($0~/yearsLived:([3-9]|[1-9][0-9])/){split($0,b,":|/");print "name:",a[2] "\nyearsLived: "b[9]}}' 'my_file'

Breaking the one-liner down:

Create a text file named awkscript and add the following code:

#!/bin/awk -f
# On every line that starts with "name", split on "=" so that a[2] holds the name.
$0~/^name/{
    split($0,a,"=")
}
# On every line whose yearsLived value is greater than 2, print the name and the years lived.
{if($0~/yearsLived:([3-9]|[1-9][0-9])/){
    split($0,b,":|/");print "name:",a[2] "\nyearsLived: "b[9] # b[9] is the yearsLived value in this layout
}
}

Now run the awk script from your shell, like this:

 awk -f 'awkscript'  'my_file'
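The script above reads the years from b[9], which depends on yearsLived always sitting at the same position in the address line. A rough sketch of a lookup by key name instead is shown below, in case the field order ever varies. The second record, with its reordered fields and two-digit value, is made up for the illustration, and min is a hypothetical variable name for the threshold.

```shell
# Temporary sample file; the second record is hypothetical (reordered fields).
log=$(mktemp)
cat > "$log" <<'EOF'
name=abc
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:5/1,other details
name=awd
address=yearsLived:12/1,country:US,other details
EOF

result=$(awk -v min=2 '
  /^name=/ { name = substr($0, 6) }          # text after "name="
  /^address=/ {
    n = split(substr($0, 9), parts, ",")     # drop "address=", split into items
    for (i = 1; i <= n; i++) {
      item = parts[i]
      sub(/^ +/, "", item)                   # trim leading spaces
      sub(/\/[0-9]+$/, "", item)             # drop a trailing marker such as "/1"
      # keep only key:value items whose key is yearsLived and whose value exceeds min
      if (split(item, kv, ":") == 2 && kv[1] == "yearsLived" && kv[2] + 0 > min)
        print "name: " name "\nyearsLived: " kv[2]
    }
  }
' "$log")
printf '%s\n' "$result"
rm -f "$log"
```

The trade-off is a few more lines in exchange for not having to count separators by hand.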

How to write a bash script to filter some data from a log
