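I want to take col1 of file1 and, if there is a match in col1 of file2, update the "date updated" value in the last column. If there is no match, I want to append the entire row from file1 to file2 and also append a "date updated" value to that row.
I am currently using awk 'NR==FNR{c[$1]++;next};c[$1] > 0' file2 file1
as a baseline comparison, but when there is a match it just prints the entire row, which is not what I want, and I can't figure out how to add another condition to update the date column. I am also trying to do this in a shell script.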
File 1
userName | cpu% | command | date created
user1, 101.6, plasma-de+, Thu Aug 8 09:30:17 MDT 2019
user2, 100.0, plasma-de+, Thu Aug 8 09:30:17 MDT 2019
user3, 102.0, plasma-de+, Thu Aug 8 09:30:17 MDT 2019
File 2
userName | cpu% | command | date created | date updated
user1, 101.6, plasma-de+, Mon Aug 5 06:35:39 MDT 2019, Mon Aug 5 06:35:39 MDT 2019
user2, 100.0, plasma-de+, Mon Aug 5 06:35:39 MDT 2019, Mon Aug 5 06:35:39 MDT 2019
File 2 after running the command
userName | cpu% | command | date created | date updated
user1, 101.6, plasma-de+, Mon Aug 5 06:35:39 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
user2, 100.0, plasma-de+, Mon Aug 5 06:35:39 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
user3, 102.0, plasma-de+, Thu Aug 8 09:30:17 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
Answer 0: (score: 0)
One non-awk way, assuming both files are sorted on the first column:
$ (join -t, -j1 -o 0,2.2,2.3,2.4,1.4 file1 file2; \
join -t, -j1 -v1 -o 0,1.2,1.3,1.4,1.4 file1 file2)
user1, 101.6, plasma-de+, Mon Aug 5 06:35:39 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
user2, 100.0, plasma-de+, Mon Aug 5 06:35:39 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
user3, 102.0, plasma-de+, Thu Aug 8 09:30:17 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
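If the inputs are not already sorted, the same two join calls can be wrapped so that sorting happens first. This is only a sketch: join_usage is a made-up helper name, mktemp is assumed to be available, the files are assumed to contain no header line, and LC_ALL=C is set so that sort and join agree on ordering. Like the command above, it prints only the matched and the new rows.

```shell
# Hypothetical helper: sort both files on column 1, then run the two joins.
# Usage: join_usage file1 file2   (result goes to stdout)
join_usage() {
    s1=$(mktemp) && s2=$(mktemp) || return 1
    # join needs both inputs sorted on the join field, in the same locale
    LC_ALL=C sort -t, -k1,1 "$1" > "$s1"
    LC_ALL=C sort -t, -k1,1 "$2" > "$s2"
    # users present in both files: keep file2's columns 2-4, take the date from file1
    LC_ALL=C join -t, -o 0,2.2,2.3,2.4,1.4 "$s1" "$s2"
    # users only in file1: print the row with its date repeated as "date updated"
    LC_ALL=C join -t, -v1 -o 0,1.2,1.3,1.4,1.4 "$s1" "$s2"
    rm -f "$s1" "$s2"
}
```

The output could then be redirected to a new file, for example join_usage file1 file2 > file2.new.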
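Or with awk, which does not need sorted input:
$ awk 'BEGIN { FS = OFS = "," }
# file1 (read first): keep the whole row and its date, keyed by user name
NR == FNR { a[$1] = $0; b[$1] = $4; next }
# file2 row with a match: overwrite "date updated", mark the user as handled
$1 in a { $5 = b[$1]; delete a[$1]; print }
# leftover file1 users: append the row with its date as "date updated"
END { for (u in a) print a[u], b[u] }' file1 file2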
user1, 101.6, plasma-de+, Mon Aug 5 06:35:39 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
user2, 100.0, plasma-de+, Mon Aug 5 06:35:39 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
user3, 102.0, plasma-de+, Thu Aug 8 09:30:17 MDT 2019, Thu Aug 8 09:30:17 MDT 2019
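Since the question mentions doing this from a shell script, here is a rough sketch of how the awk approach might be wrapped to rewrite file2 in place. It is a variation, not the command above: it also keeps file2 rows that have no match in file1, and it appends new users in file1 order instead of the unspecified for-in order. update_usage is a made-up function name, mktemp is assumed to be available, and the files are assumed to be comma-separated with no header line.

```shell
# Hypothetical helper: merge the new snapshot ($1) into the master file ($2).
# Usage: update_usage file1 file2
update_usage() {
    tmp=$(mktemp) || return 1
    awk 'BEGIN { FS = OFS = "," }
        # First file: remember each row and its date, in input order.
        FILENAME == ARGV[1] {
            if (!($1 in row)) order[++n] = $1
            row[$1] = $0; date[$1] = $4
            next
        }
        # Second file: refresh "date updated" when the user is in the first file.
        $1 in row { $5 = date[$1]; seen[$1] = 1 }
        # Print every row of the second file, changed or not.
        { print }
        # Append users that appeared only in the first file.
        END {
            for (i = 1; i <= n; i++) {
                u = order[i]
                if (!(u in seen)) print row[u], date[u]
            }
        }
    ' "$1" "$2" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Copy the result back so the master file keeps its permissions.
    cat "$tmp" > "$2"
    rm -f "$tmp"
}
```

Calling update_usage file1 file2 leaves file1 untouched and overwrites file2 with the merged rows.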