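Printing rows by comparing values in a column (using awk)

Time: 2017-05-15 20:03:49

Tags: bash shell awk

Say I have 2 files, file1.csv and file2.csv. I need to compare column 2 (a string value) of the two files and print the rows of file2.csv whose value in column 3 does not exist in column 3 of file1.csv.

I tried the following awk command: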

awk -F'\t' 'NR==FNR{c[$3]++;next};c[$3] == 0' file1.csv file2.csv

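However, this just gives me all of file2.csv, even though file2.csv contains only 2 extra rows that are not in file1.csv.

Can someone tell me what is wrong here?

Snippet of file1.csv (columns are numbered from 0)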

ANR     26545   CallExpression                  mutex_unlock ( & mmc_test_lock )
ANR     26546   Callee                          mutex_unlock
ANR     26547   Identifier                      mutex_unlock
ANR     26548   ArgumentList                    & mmc_test_lock
ANR     26549   Argument                        & mmc_test_lock
ANR     26550   UnaryOperationExpression        & mmc_test_lock
ANR     26551   UnaryOperator                   &
ANR     26552   Identifier                      mmc_test_lock
ANR     26553   ExpressionStatement             "__free_pages ( test -> highmem , BUFFER_ORDER )"
ANR     26554   CallExpression                  "__free_pages ( test -> highmem , BUFFER_ORDER )" 
ANR     26555   Callee                          __free_pages 
ANR     26556   Identifier                      __free_pages
ANR     26557   ArgumentList                    test -> highmem
ANR     26558   Argument                        test -> highmem 
ANR     26559   PtrMemberAccess                 test -> highmem
ANR     26560   Identifier                      test
ANR     26561   Identifier                      highmem
ANR     26562   Argument                        BUFFER_ORDER
ANR     26563   Identifier                      BUFFER_ORDER 

Excerpt of file2.csv

ANR     12910   CallExpression                  mutex_unlock ( & mmc_test_lock )
ANR     12911   Callee                          mutex_unlock
ANR     12912   Identifier                      mutex_unlock
ANR     12913   ArgumentList                    & mmc_test_lock
ANR     12914   Argument                        & mmc_test_lock
ANR     12915   UnaryOperationExpression        & mmc_test_lock
ANR     12916   UnaryOperator                   & 
ANR     12917   Identifier                      mmc_test_lock 
ANR     12918   IfStatement                     if ( test -> highmem )
ANR     12919   Condition                       test -> highmem 
ANR     12920   PtrMemberAccess                 test -> highmem
ANR     12921   Identifier                      test
ANR     12922   Identifier                      highmem
ANR     12923   ExpressionStatement             "__free_pages ( test -> highmem , BUFFER_ORDER )"
ANR     12924   CallExpression                  "__free_pages ( test -> highmem , BUFFER_ORDER )" 
ANR     12925   Callee                          __free_pages
ANR     12926   Identifier                      __free_pages
ANR     12927   ArgumentList                    test -> highmem
ANR     12928   Argument                        test -> highmem
ANR     12929   PtrMemberAccess                 test -> highmem
ANR     12930   Identifier                      test
ANR     12931   Identifier                      highmem
ANR     12932   Argument                        BUFFER_ORDER
ANR     12933   Identifier                      BUFFER_ORDER

Expected output:

ANR     12918   IfStatement     if ( test -> highmem )
ANR     12919   Condition       test -> highmem 

1 Answer:

Answer 0: (score: 2)

You need to change your awk command to:

awk -F'\t' 'NR==FNR {seen[$2]; next} !($2 in seen)' file1.csv file2.csv
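To see the idiom at work, here is a minimal, self-contained sketch. It rests on assumptions that are not in the original post: the sample rows are a shortened rebuild of the excerpts above with exactly one tab between fields, and the temporary directory and the `col` variable are introduced only for illustration. awk numbers fields from 1, so with strictly tab-separated data the question's column 2 (counted from 0) is field 3; that is why `col=3` is used here. Set `col` to whichever field holds the value to compare in the real files.

```shell
#!/bin/sh
# Sketch: print rows of file2.csv whose key field never appears in file1.csv.
# Assumptions (not from the original post): single-tab separators, key in
# field 3, and a shortened, hypothetical rebuild of the sample data.
tmp=$(mktemp -d)

# Shortened file1.csv
printf 'ANR\t26545\tCallExpression\tmutex_unlock ( & mmc_test_lock )\n' >  "$tmp/file1.csv"
printf 'ANR\t26559\tPtrMemberAccess\ttest -> highmem\n'                >> "$tmp/file1.csv"
printf 'ANR\t26560\tIdentifier\ttest\n'                                >> "$tmp/file1.csv"

# Shortened file2.csv, with two rows whose type is absent from file1.csv
printf 'ANR\t12910\tCallExpression\tmutex_unlock ( & mmc_test_lock )\n' >  "$tmp/file2.csv"
printf 'ANR\t12918\tIfStatement\tif ( test -> highmem )\n'             >> "$tmp/file2.csv"
printf 'ANR\t12919\tCondition\ttest -> highmem\n'                      >> "$tmp/file2.csv"
printf 'ANR\t12920\tPtrMemberAccess\ttest -> highmem\n'                >> "$tmp/file2.csv"
printf 'ANR\t12921\tIdentifier\ttest\n'                                >> "$tmp/file2.csv"

col=3   # 1-based field number of the key (assumed; adjust to the real data)

result=$(awk -F'\t' -v col="$col" '
    NR == FNR { seen[$col]; next }   # first file: remember every key
    !($col in seen)                  # second file: print rows with an unseen key
' "$tmp/file1.csv" "$tmp/file2.csv")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample data, only the IfStatement and Condition rows are printed. Note that `seen[$col]` merely creates the array key, and `in` tests membership without adding anything, whereas a lookup such as `c[$3] == 0` creates a new element for every key it checks.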