This is a follow-up question to "Unexpected result comparing values of rows and columns in two text files".
I created a structure to compare two text files by row and column. The file layout is as follows:
file1.txt
Name Col1 Col2 Col3
-----------------------
row1 1 4 7
row2 2 5 8
row3 3 6 9
file2.txt
Name Col1 Col2 Col3
-----------------------
row1 1 4 7
row2 2 5 999
Here is my code so far:
dos2unix ravi # 2>/dev/null
dos2unix ravi2 # 2>/dev/null
awk '
FNR <= 2 {next} # skips first two lines (header and dashed rule)
FNR == NR {
    for (i = 2; i <= NF; i++) {
        a[i,$1] = $i;
    }
    b[$1];
    next;
}
($1 in b) { # check if row in file2 existed in file1
    for (i = 2; i <= NF; i++) {
        if (a[i,$1] == $i)
            printf("%s->col%d: %s vs %s: Are Equal\n", $1, i-1, a[i,$1], $i);
        else
            printf("%s->col%d: %s vs %s: Not Equal\n", $1, i-1, a[i,$1], $i);
    }
}
!($1 in b) { # check if row in file2 does not exist in file1
    for (i = 2; i <= NF; i++)
        printf("%s->col%d: %s vs %s: Are Not Equal\n", $1, i-1, "blank", $i);
}
# pattern needed to check if row in file1 does not exist in file2
' "$PWD/file1.txt" "$PWD/file2.txt"
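Does anyone have a hint or suggestion for an awk pattern that checks whether a row in file1 is missing from file2? See the sample output below for what I mean (i.e., I basically want to print that the values of row3 in file1 do not exist in file2). Thanks! Let me know if further explanation is needed.
Desired output: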
row2->Col1: 1 vs 1: Equal
row2->Col2: 4 vs 4: Equal
row2->Col3: 7 vs 7: Equal
row1->Col1: 2 vs 2: Equal
row1->Col2: 5 vs 5: Equal
row1->Col3: 8 vs 999: Not Equal
row3->Col1: 3 vs (blank) : Not Equal
row3->Col2: 6 vs (blank) : Not Equal
row3->Col3: 9 vs (blank) : Not Equal
Actual output:
row2->Col1: 1 vs 1: Equal
row2->Col2: 4 vs 4: Equal
row2->Col3: 7 vs 7: Equal
row1->Col1: 2 vs 2: Equal
row1->Col2: 5 vs 5: Equal
row1->Col3: 8 vs 999: Not Equal
Answer 0: (score: 4)
Extended answer:
$ cat script.awk
FNR <= 2 { next } # skips first two lines (header and dashed rule)
FNR == NR {
    for (i = 2; i <= NF; i++) { a[i,$1] = $i }
    b[$1];
    next;
}
($1 in b) { # check if row in file2 existed in file1
    for (i = 2; i <= NF; i++) {
        if (a[i,$1] == $i)
            printf("%s->col%d: %s vs %s: Are Equal\n", $1, i-1, a[i,$1], $i);
        else
            printf("%s->col%d: %s vs %s: Not Equal\n", $1, i-1, a[i,$1], $i);
    }
    delete b[$1]; # delete entries which are processed
}
END {
    for (left in b) { # look which didn't match
        # NF still holds the field count of the last record read
        for (i = 2; i <= NF; i++)
            printf("%s->col%d: %s vs (blank): Not Equal\n", left, i-1, a[i,left])
    }
}
Run it like this:
$ awk -f script.awk file1 file2
row1->col1: 1 vs 1: Are Equal
row1->col2: 4 vs 4: Are Equal
row1->col3: 7 vs 7: Are Equal
row2->col1: 2 vs 2: Are Equal
row2->col2: 5 vs 5: Are Equal
row2->col3: 8 vs 999: Not Equal
row3->col1: 3 vs (blank): Not Equal
row3->col2: 6 vs (blank): Not Equal
row3->col3: 9 vs (blank): Not Equal
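Answer 1: (score: 1)
If you know that each row "name" (first column) will appear at most once in each file, then you can add delete b[$1] at the end of the ($1 in b) block, move the !($1 in b) block ahead of it (otherwise a row just deleted from b would match that block too), and then add an END block that loops over everything left in b and prints out your lines.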
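One way to read the suggestion above, as a minimal sketch rather than the answerer's exact code: it keeps the block for rows that exist only in file2, deletes matched names from b, and reports the leftovers in END. The temporary directory, the extra row4 line in file2.txt, the ncols variable, and the "(blank) vs ..." wording for file2-only rows are assumptions added here for illustration. ncols stores the field count while file1 is read, so the END block does not depend on NF from the last record.

```shell
#!/bin/sh
# Sample data mirroring the question; row4 is a hypothetical extra row
# added only to exercise the "row exists only in file2" branch.
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
Name Col1 Col2 Col3
-----------------------
row1 1 4 7
row2 2 5 8
row3 3 6 9
EOF
cat > "$tmp/file2.txt" <<'EOF'
Name Col1 Col2 Col3
-----------------------
row1 1 4 7
row2 2 5 999
row4 0 0 0
EOF

result=$(awk '
FNR <= 2 { next }                 # skip the header and the dashed rule
FNR == NR {                       # first file: remember every cell
    for (i = 2; i <= NF; i++) a[i,$1] = $i
    b[$1]                         # row names not yet seen in file2
    ncols = NF                    # assumed helper: row width for END
    next
}
!($1 in b) {                      # row exists only in file2
    for (i = 2; i <= NF; i++)
        printf("%s->col%d: (blank) vs %s: Not Equal\n", $1, i-1, $i)
    next
}
{                                 # row exists in both files
    for (i = 2; i <= NF; i++)
        printf("%s->col%d: %s vs %s: %s\n", $1, i-1, a[i,$1], $i,
               (a[i,$1] == $i ? "Are Equal" : "Not Equal"))
    delete b[$1]                  # mark this row name as handled
}
END {                             # names left in b never showed up in file2
    for (left in b)
        for (i = 2; i <= ncols; i++)
            printf("%s->col%d: %s vs (blank): Not Equal\n", left, i-1, a[i,left])
}
' "$tmp/file1.txt" "$tmp/file2.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample data, row3 should be reported as missing from file2 and row4 as missing from file1. Note that the order in which for (left in b) visits leftover names is unspecified in awk, so pipe the output through sort if a stable order matters.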