Search for a different pattern in each line (grep / awk / sed)

Time: 2016-04-12 07:25:02

Tags: bash awk sed grep

Suppose I have a text file:

2016-02-10  [id-2555] data:{"flower":"hmm","Floatnumber":0.001067,"animal":"cat"} 2016-02-10  [id-2555] hello > bye (1120 > 1067.444)
2016-02-10  [id-2556] data:{"flower":"hmm","Floatnumber":0.001267,"animal":"cat"} 2016-02-10  [id-2556] hello > bye (1520 > 1267.555)
2016-02-10  [id-2556] data:{"flower":"hmm","Floatnumber":0.001367,"animal":"cat"} 2016-02-10  [id-2556] hello > bye (1820 > 1367.666)

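Is it possible to check whether the Floatnumber in each line equals the second number in the parentheses? For example:

 "Floatnumber":0.001067 == int(1067.444)/1000000?

I know how to get at the second number: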

awk '{ print int($11)/1000000 }'

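But I don't know how to pull out a different pattern on each line and compare against it.

EDIT1: If the values match, is it possible to print the data JSON, or just the flower value?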

3 answers:

Answer 0: (score: 1)

It is easier with perl:

$ cat data
2016-02-10 [id-2555] data:{"flower":"hmm","Floatnumber":0.001067,"animal":"cat"} 2016-02-10 [id-2555] hello > bye (1120 > 1067)
2016-02-10 [id-2556] data:{"flower":"hmm","Floatnumber":0.001267,"animal":"cat"} 2016-02-10 [id-2556] hello > bye (1520 > 1267)
2016-02-10 [id-2556] data:{"flower":"hmm","Floatnumber":0.001367,"animal":"cat"} 2016-02-10 [id-2556] hello > bye (1820 > 1367)
2016-02-10 [id-2556] data:{"flower":"hmm","Floatnumber":0.000367,"animal":"cat"} 2016-02-10 [id-2556] hello > bye (1820 > 1368)
# I have added a 4th line, where the condition does not match.

$ perl -nE 'm/"Floatnumber":([0-9.]*)/; my $a=$1; m/> ([0-9]*)\)$/; my $b=$1; say ((($a *1000000) == $b)?"true":"false");' <data
true
true
true
false

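Explanation: match the floating-point number that follows "Floatnumber": and save it in $a. Match the trailing integer and store it in $b.
If $a * 1000000 == $b, print true; otherwise print false.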
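
Note that the regex above expects a plain integer before the closing parenthesis, as in this answer's sample data, while the question's lines end in values like 1067.444. As a rough sketch of the same two-match idea in awk (the helper name check_floatnumber and the "no-match" marker are made up for illustration), the following accepts a fractional part and compares integers rather than floats, on the assumption that the line layout is the one shown in the question:

```shell
# Sketch: compare "Floatnumber" with the last number in parentheses.
# check_floatnumber is a hypothetical helper; it reads lines on stdin.
check_floatnumber() {
    awk '
    {
        # Grab the digits that follow "Floatnumber": (that prefix is 14 chars).
        if (!match($0, /"Floatnumber":[0-9.]+/)) { print "no-match"; next }
        f = substr($0, RSTART + 14, RLENGTH - 14)

        # Grab the number between the final "> " and the closing ")".
        if (!match($0, /> [0-9.]+\)$/)) { print "no-match"; next }
        n = substr($0, RSTART + 2, RLENGTH - 3)

        # Round f * 1000000 to the nearest integer and truncate n,
        # so the comparison does not rely on exact float equality.
        result = (int(f * 1000000 + 0.5) == int(n)) ? "true" : "false"
        print result
    }'
}
```

It would be called as check_floatnumber < data. Rounding before comparing is a design choice here, since a product such as 0.001067 * 1000000 is not guaranteed to come out as exactly 1067 in floating point.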

EDIT1
If the condition holds, print the data:

perl -nE 'm/"Floatnumber":([0-9.]*)/; my $a=$1; m/> ([0-9]*)\)$/; my $b=$1; m/data:{([^ ]*)}/; say ((($a *1000000) == $b)?$1:"NULL");' <data
"flower":"hmm","Floatnumber":0.001067,"animal":"cat"
"flower":"hmm","Floatnumber":0.001267,"animal":"cat"
"flower":"hmm","Floatnumber":0.001367,"animal":"cat"
NULL

If the condition holds, print the flower value:

perl -nE 'm/"Floatnumber":([0-9.]*)/; my $a=$1; m/> ([0-9]*)\)$/; my $b=$1; m/"flower":"([^"]*)"/; say ((($a *1000000) == $b)?$1:"NULL");' <data
hmm
hmm
hmm
NULL
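
For readers who would rather stay in awk, here is a minimal sketch of the same "print the flower value or NULL" behaviour. The function name print_matching_flower is hypothetical, and it assumes the question's line layout, including a possible fractional part inside the parentheses:

```shell
# Sketch: print the "flower" value when the numbers agree, NULL otherwise.
# print_matching_flower is a hypothetical helper; it reads lines on stdin.
print_matching_flower() {
    awk '
    {
        # Lines that lack either number are skipped silently.
        if (!match($0, /"Floatnumber":[0-9.]+/)) next
        f = substr($0, RSTART + 14, RLENGTH - 14)
        if (!match($0, /> [0-9.]+\)$/)) next
        n = substr($0, RSTART + 2, RLENGTH - 3)

        # Integer comparison after rounding, to avoid float equality.
        if (int(f * 1000000 + 0.5) != int(n)) { print "NULL"; next }

        # Extract the text between the quotes after "flower": (10-char prefix).
        if (match($0, /"flower":"[^"]*"/))
            print substr($0, RSTART + 10, RLENGTH - 11)
    }'
}
```

Usage would be print_matching_flower < data.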

Answer 1: (score: 0)

Another suggestion:

awk -F'[:,()]|[[:blank:]]+' '{print $7*1000000==int($18)?"same":"different", $0}'

# or print only lines where the values are the same:
awk -F'[:,()]|[[:blank:]]+' '$7*1000000==int($18){print}'

# values are the same and contain "2555" in the string
awk -F'[:,()]|[[:blank:]]+' '$7*1000000==int($18) && /2555/{print}'

-F sets the field separator, given here as a regular expression.
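
To see why the float ends up in $7 and the second parenthesized number in $18, it can help to dump every field with its index. This is only a sketch: show_fields is a made-up helper name, and the separator is written with [ \t] in place of [[:blank:]] on the assumption that some older awk builds may not support POSIX character classes.

```shell
# Sketch: print each field with its index for the separator used above.
# show_fields is a hypothetical helper; it reads lines on stdin.
show_fields() {
    awk -F'[:,()]|[ \t]+' '{
        for (i = 1; i <= NF; i++)
            printf "%d=[%s]\n", i, $i
    }'
}
```

Running head -n 1 data | show_fields on the question's file should list the Floatnumber value as field 7 and the last number as field 18. Empty fields appear wherever two separators are adjacent, for example between the space and the opening parenthesis.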

Answer 2: (score: 0)

Using gawk and jq:

awk -F"[:, )]" '$8==int($(NF-1))/1000000{
    f=gensub(/[^:]+:([^}]+}) .+/, "\\1", "g",$0);
    print f
}'

Then pipe the awk output to jq -r '.flower'.
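
gensub is specific to gawk. Where only a POSIX awk is available, the JSON payload can probably be extracted with match() and substr() instead. The sketch below keeps the field test from the answer, so it assumes the question's layout with two spaces after each date, which is what puts the float in $8. The name matching_json is hypothetical.

```shell
# Sketch: print the {...} payload of lines whose numbers agree.
# matching_json is a hypothetical helper; it reads lines on stdin.
matching_json() {
    awk -F'[:, )]' '
    # Same condition as the gawk version: divide the truncated integer.
    $8 == int($(NF-1)) / 1000000 {
        # The payload is the first brace-delimited group on the line.
        if (match($0, /[{][^}]*[}]/))
            print substr($0, RSTART, RLENGTH)
    }'
}
```

The full pipeline would then look like matching_json < data | jq -r '.flower'.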