Compare the first column in two files; if it matches, update the last column, otherwise append the line to the second file

Asked: 2019-08-08 18:37:43

Tags: awk text-processing

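I want to take col1 of file1 and, if there is a match in col1 of file2, update the "date updated" value in the last column. If there is no match, I want to append the entire line from file1 to file2, and also append a "date updated" value to that line.

I'm currently using awk 'NR==FNR{c[$1]++;next};c[$1] > 0' file2 file1 as a baseline comparison, but when there is a match that just prints the entire line, which is not what I want, and I can't figure out how to add another condition that updates the date column. I'm also trying to do this in a shell script.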

File 1

userName | cpu% | command | date created

   user1, 101.6, plasma-de+, Thu Aug  8 09:30:17 MDT 2019
   user2, 100.0, plasma-de+, Thu Aug  8 09:30:17 MDT 2019
   user3, 102.0, plasma-de+, Thu Aug  8 09:30:17 MDT 2019

File 2

userName | cpu% | command | date created | date updated

    user1, 101.6, plasma-de+, Mon Aug  5 06:35:39 MDT 2019,    Mon Aug  5 06:35:39 MDT 2019 
    user2, 100.0, plasma-de+, Mon Aug  5 06:35:39 MDT 2019,    Mon Aug  5 06:35:39 MDT 2019

File 2 after running the command

userName | cpu% | command | date created | date updated

    user1, 101.6, plasma-de+, Mon Aug  5 06:35:39 MDT 2019,    Thu Aug  8 09:30:17 MDT 2019
    user2, 100.0, plasma-de+, Mon Aug  5 06:35:39 MDT 2019,    Thu Aug  8 09:30:17 MDT 2019
    user3, 102.0, plasma-de+, Thu Aug  8 09:30:17 MDT 2019,    Thu Aug  8 09:30:17 MDT 2019

1 Answer:

Answer 0 (score: 0)

One approach with join, which assumes both files are sorted on the first column:

$ (join -t, -j1 -o 0,2.2,2.3,2.4,1.4 file1 file2; \
   join -t, -j1 -v1 -o 0,1.2,1.3,1.4,1.4 file1 file2)
user1, 101.6, plasma-de+, Mon Aug  5 06:35:39 MDT 2019, Thu Aug  8 09:30:17 MDT 2019
user2, 100.0, plasma-de+, Mon Aug  5 06:35:39 MDT 2019, Thu Aug  8 09:30:17 MDT 2019
user3, 102.0, plasma-de+, Thu Aug  8 09:30:17 MDT 2019, Thu Aug  8 09:30:17 MDT 2019
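
join expects both inputs to be sorted on the join field, in the same collation order that join itself uses. If that is not guaranteed, one option is to sort copies of the files first. A minimal sketch, assuming a POSIX shell with mktemp available; the function name merge_sorted and the temporary files are illustrative and not part of the answer:

```shell
# Hypothetical helper: sort both inputs on column 1, then run the two joins.
# Usage: merge_sorted file1 file2
merge_sorted() {
    s1=$(mktemp) && s2=$(mktemp) || return 1
    # LC_ALL=C keeps the sort order and the order join expects consistent.
    LC_ALL=C sort -t, -k1,1 "$1" > "$s1"
    LC_ALL=C sort -t, -k1,1 "$2" > "$s2"
    # Users present in both files: keep the file2 row, take the date from file1.
    # (join uses field 1 of each file by default, so no -j option is needed.)
    LC_ALL=C join -t, -o 0,2.2,2.3,2.4,1.4 "$s1" "$s2"
    # Users only in file1: the file1 date serves as both created and updated.
    LC_ALL=C join -t, -v1 -o 0,1.2,1.3,1.4,1.4 "$s1" "$s2"
    rm -f "$s1" "$s2"
}
```

Calling merge_sorted file1 file2 prints the merged rows to standard output; the matched users come first, followed by the users that exist only in file1.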

Or with awk, which does not require sorted input:

$ awk 'BEGIN { FS = OFS = "," }
       NR == FNR { a[$1] = $0; b[$1] = $4; next }    # file1: save each row and its date
       $1 in a { $5 = b[$1]; delete a[$1]; print }   # file2 match: refresh column 5
       END { for (u in a) print a[u], b[u] }' file1 file2
user1, 101.6, plasma-de+, Mon Aug  5 06:35:39 MDT 2019, Thu Aug  8 09:30:17 MDT 2019
user2, 100.0, plasma-de+, Mon Aug  5 06:35:39 MDT 2019, Thu Aug  8 09:30:17 MDT 2019
user3, 102.0, plasma-de+, Thu Aug  8 09:30:17 MDT 2019, Thu Aug  8 09:30:17 MDT 2019
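
Since the question asks for this inside a shell script that updates file2 itself, the awk approach can be wrapped as sketched below. Note that the awk command above prints a file2 row only when its user also appears in file1; this variant additionally keeps file2 rows that have no match, which is an assumption about the desired behavior rather than something the question states. The function name update_file2 and the temporary-file handling are likewise illustrative:

```shell
# Hypothetical helper: merge the rows of "$1" (file1) into "$2" (file2) in place.
update_file2() {
    new=$1
    master=$2
    tmp=$(mktemp) || return 1
    awk 'BEGIN { FS = OFS = "," }
         # First file: remember each row and its date (column 4).
         # Testing FILENAME instead of NR == FNR also copes with an empty first file.
         FILENAME == ARGV[1] { line[$1] = $0; date[$1] = $4; next }
         # Second file, known user: refresh "date updated" (column 5).
         $1 in line { $5 = date[$1]; delete line[$1] }
         # Print every row of the second file, matched or not.
         { print }
         # Users seen only in the first file: created == updated.
         END { for (u in line) print line[u], date[u] }
    ' "$new" "$master" > "$tmp" &&
    cat "$tmp" > "$master"    # overwrite contents, keeping the file permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Calling update_file2 file1 file2 rewrites file2. When more than one new user is appended, their relative order is unspecified, because awk's for (u in line) loop does not iterate in a defined order.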