I have a log file in the following format.
-----------------------------------------------------------
name=abc
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:5/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------
name=xyz
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:3/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------
name=awd
address=country:US,Zip:12345/1,city:ny/1,state:ny/1, yearsLived:2/1,other details
healthDetails=healthplan:medixx/1, expensesInDollars:150/1, other details
-----------------------------------------------------------
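For each name in the log file, I want to extract the person's name and the years lived, but only if the number of years is greater than some threshold (say 2). The file will also contain duplicate names with different details.
Output: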
name:abc
yearsLived:5
name:xyz
yearsLived:3
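I tried to do this with the grep and cut commands. The problem I ran into is that once I run grep or cut, I lose the other part, i.e. either the name or the address. How can I solve this?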
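Answer 0 (score: 1)
Here's a stab at it (this needs GNU awk, because the three-argument form of match() is a gawk extension):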
awk 'BEGIN {RS = "name="} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {print $1 "\t" years[2]}' records_file
Edit: to accommodate the updated log sample and the desired output:
awk 'BEGIN {RS = "-{59}"} NR > 1 {match($0, "yearsLived:[0-9]+", yl) ; split(yl[0], years, ":")} NR > 1 && years[2] > 2 {sub("=", ":", $1); print $1 "\n" yl[0]}' records
Edit 2: Oops, I forgot to add a comment: to change the threshold for the number of years, change the second 2 in years[2] > 2. Hope that helps.
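If the threshold changes often, or GNU awk is not available, the same idea can be sketched in portable awk with the threshold passed in as a variable. This is only a sketch under assumptions: the function name years_over and the variable min are made up for illustration, and it assumes every name= line comes before its address line, as in the sample log.

```shell
# Hypothetical helper: print name/yearsLived pairs whose yearsLived exceeds MIN.
# Usage: years_over MIN FILE
years_over() {
  awk -v min="$1" '
    # Remember the most recent name (everything after "name=").
    /^name=/ { name = substr($0, 6) }
    # On a line containing yearsLived:<digits>, cut out just the digits.
    match($0, /yearsLived:[0-9]+/) {
      years = substr($0, RSTART + 11, RLENGTH - 11) + 0   # 11 = length("yearsLived:")
      if (years > min + 0)                                # numeric comparison
        print "name:" name "\nyearsLived:" years
    }
  ' "$2"
}
```

With the sample log saved as records, calling years_over 2 records prints name:abc, yearsLived:5, name:xyz, yearsLived:3, one per line. Because the number is compared numerically, values of 10 or more are handled as well.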
Answer 1 (score: 0)
Use awk, like this:
awk '$0~/^name/{split($0,a,"=")}{if($0~/yearsLived:[0-9]/){split($0,b,":|/");if(b[9]+0>2)print "name:",a[2] "\nyearsLived: "b[9]}}' 'my_file'
Breaking down the one-line shell command:
Create a text file named awkscript and add the following code:
#!/bin/awk -f
$0~/^name/{
# on every line that starts with "name", split on "=" into array 'a' (a[2] is the name)
split($0,a,"=")
}
# on every line that has yearsLived, split on ":" or "/" (b[9] is the number of years)
# and print the name and years lived when that number is greater than 2
{if($0~/yearsLived:[0-9]/){
split($0,b,":|/")
if(b[9]+0>2)print "name:",a[2] "\nyearsLived: "b[9] # print name and years
}
}
Now run the awk script in your shell, like this:
awk -f 'awkscript' 'my_file'