How to use awk to find the intersection of two files

Date: 2019-02-22 11:20:20

Tags: awk

I have two files. Each file has its own unique job_id value.

shop_file.txt: the job_id value in this file is always 21.

407486;{"shop_id":"407486","job_id":21}
163181148;{"shop_id":"163181148","job_id":21}
1510942977;{"shop_id":"1510942977","job_id":21}

dish_file.txt: the job_id value in this file is always 23.

7491777303;{"shop_id":"407486","dish_id":"7491777303","job_id":23}
1667364700;{"shop_id":"1664150969","dish_id":"1667364700","job_id":23}
1932540486;{"shop_id":"1932534033","dish_id":"1932540486","job_id":23}
1932540468;{"shop_id":"1932534033","dish_id":"1932540468","job_id":23}
1932540477;{"shop_id":"1932534033","dish_id":"1932540477","job_id":23}
1932540456;{"shop_id":"1932534033","dish_id":"1932540456","job_id":23}
1673778516;{"shop_id":"1478428493","dish_id":"1673778516","job_id":23}
1673462967;{"shop_id":"1642179256","dish_id":"1673462967","job_id":23}
1721592562;{"shop_id":"1697153440","dish_id":"1721592562","job_id":23}
8491777303;{"shop_id":"407486","dish_id":"8491777303","job_id":23}

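I want the lines of dish_file.txt whose shop_id value also appears in shop_file.txt, i.e. the same shop_id must exist in both files. How can I get this result with awk?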

7491777303;{"shop_id":"407486","dish_id":"7491777303","job_id":23}
8491777303;{"shop_id":"407486","dish_id":"8491777303","job_id":23}

2 answers:

Answer 0: (score: 2)

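Ideally, JSON data should be parsed with a real JSON parser. But assuming the structure is fixed and consistent with what is shown (importantly, shop_id is the first key and its value must not contain any commas), splitting each line on ";" and "," makes the second field the shop_id pair, which can then be compared across the two files: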

$ awk -F'[;,]' 'NR==FNR {a[$2]; next} $2 in a' shop_file.txt dish_file.txt
7491777303;{"shop_id":"407486","dish_id":"7491777303","job_id":23}
8491777303;{"shop_id":"407486","dish_id":"8491777303","job_id":23}
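
If shop_id is not guaranteed to be the first key, a variant along the following lines may be more forgiving. This is only a minimal sketch, not part of the original answer: the helper function shop_id(), the array name seen, the scratch directory, and the shell variable result are names made up for illustration, and the last dish line has its keys reordered on purpose to exercise the case. It still assumes the value is a plain quoted string without escaped quotes.

```shell
#!/bin/sh
# Work in a scratch directory so the sample files do not overwrite anything.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Trimmed copy of the sample data from the question.
cat > shop_file.txt <<'EOF'
407486;{"shop_id":"407486","job_id":21}
163181148;{"shop_id":"163181148","job_id":21}
1510942977;{"shop_id":"1510942977","job_id":21}
EOF

# The last line has its keys reordered on purpose (hypothetical variation).
cat > dish_file.txt <<'EOF'
7491777303;{"shop_id":"407486","dish_id":"7491777303","job_id":23}
1667364700;{"shop_id":"1664150969","dish_id":"1667364700","job_id":23}
8491777303;{"dish_id":"8491777303","shop_id":"407486","job_id":23}
EOF

result=$(awk '
    # Return the "shop_id":"..." pair found anywhere in the line, or "" if absent.
    function shop_id(line) {
        if (match(line, /"shop_id":"[^"]*"/) == 0)
            return ""
        return substr(line, RSTART, RLENGTH)
    }
    # NR == FNR holds only while the first file (shop_file.txt) is being read:
    # remember every shop_id pair seen there, then skip to the next line.
    NR == FNR {
        key = shop_id($0)
        if (key != "")
            seen[key]
        next
    }
    # Second file: print the line if its shop_id pair was seen in the first file.
    {
        key = shop_id($0)
        if (key in seen)
            print
    }
' shop_file.txt dish_file.txt)

printf '%s\n' "$result"
```

Because the key is the whole quoted pair rather than a field position, the comparison no longer depends on where shop_id sits in the object. The NR==FNR test works the same way as in the one-liner above.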

Answer 1: (score: 1)

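The following grep/sed one-liner should work for the given example (the <(...) process substitution requires bash or a similar shell):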

grep -f <(sed 's/.*\("shop_id"[^,]*\).*/\1/' shopFile) dishFile
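
To make the two stages of that pipeline visible, here is a minimal sketch that writes the extracted patterns to a file first, which also lets it run under plain sh without process substitution. Several details are assumptions that go beyond the original answer: the file name patterns.txt, the scratch directory, the variable result, the use of sed -n with the p flag (so that lines without a shop_id are not turned into patterns), and grep -F (so that the patterns are treated as fixed strings, not regular expressions).

```shell
#!/bin/sh
# Work in a scratch directory so the sample files do not overwrite anything.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Trimmed copy of the sample data from the question.
cat > shop_file.txt <<'EOF'
407486;{"shop_id":"407486","job_id":21}
163181148;{"shop_id":"163181148","job_id":21}
1510942977;{"shop_id":"1510942977","job_id":21}
EOF

cat > dish_file.txt <<'EOF'
7491777303;{"shop_id":"407486","dish_id":"7491777303","job_id":23}
1667364700;{"shop_id":"1664150969","dish_id":"1667364700","job_id":23}
8491777303;{"shop_id":"407486","dish_id":"8491777303","job_id":23}
EOF

# Stage 1: reduce every shop line to its "shop_id":"..." pair.
# -n plus the p flag prints only lines where the substitution succeeded.
sed -n 's/.*\("shop_id"[^,]*\).*/\1/p' shop_file.txt > patterns.txt

# Stage 2: keep the dish lines that contain any of those pairs.
# -F treats each pattern as a fixed string; -f reads the patterns from a file.
result=$(grep -F -f patterns.txt dish_file.txt)

printf '%s\n' "$result"
```

Since each pattern includes the surrounding double quotes, a shop_id such as 407486 does not accidentally match a longer id that merely contains those digits.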