File 1:
SA, 5006, 12, , DJ
CN, BN, , BBB, 13
22, 67, GG, FF, 88
33, BB, AA, CC, 22
File 2:
SA, 5006, 12, 15 , DJ
CN, BN, , BBB, 13
empty line
33, CC, AA, dd, 22
Output:
SA, 5006, 12, 15 , DJ, unmatch, 4
CN, BN, , BBB, 13, match
empt, empt, empt, empt, empt, unmatch, 12345
33, CC, AA, dd, 22, unmatch, 24
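I need to compare two .csv files line by line, but some fields or whole lines may be empty. The output should go to file3: the five columns from file 2, then match/unmatch, then the numbers of the fields that do not match, like this: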
c1, c2, c3, c4, c5, match/unmatch, concatenation of digits representing unmatch fields.
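I have tried a few things, and I think awk can help here. Can it? :)
This is the code I used, but I think the problem is the empty fields, and I don't know how to print them: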
##Set input and output field separators to ':'.
BEGIN {
FS = OFS = ":"
}
NR == FNR {
## save all the line in an array, so lines will be saved like:
## c1::c2::c3::c4::c5
++a[$0]
## Process next line from the beginning.
next
}
## for every line of second file.
{
## Search for the line in the array, if not exists it means that any field is different
## print the line.
if ( !a[$0] ) {
$6 = "same"
print
}else {
$6 = " not same"
print
}
}
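Answer 0: (score: 2)
You need to use the line number as the index of the array that is saved between the files, so that corresponding lines of the two files are compared.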
BEGIN { FS = OFS = ", " }            # Split on ", " and rejoin with ", " when a field is assigned
NR == FNR { a[FNR] = $0; next }      # In first file, just save each line in an array
{
    if (a[FNR] == $0) {              # Compare line in 2nd file to corresponding line in first file
        $6 = "match"
    } else {
        $6 = "unmatch"
        split(a[FNR], b)             # Split up the fields from the first file
        $7 = ""
        for (i = 1; i <= 5; i++) {   # Compare each field
            if ($i != b[i]) { $7 = $7 i }   # Add non-matching field numbers to output
        }
    }
    print
}
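To try the line-number approach on the sample data from the question, here is a minimal sketch. It assumes the fields are always separated by exactly ", ". The temporary directory and the names file1.csv, file2.csv and file3.csv are only illustrative. Unlike the script above, it appends "" to both sides of the comparison to force a string comparison, so values such as 12 and 12.0 would not be treated as equal.

```shell
#!/bin/sh
# Hypothetical working directory and file names, used only for this demo.
dir=$(mktemp -d)

# Sample data from the question; the third line of file2.csv is empty.
printf '%s\n' 'SA, 5006, 12, , DJ' 'CN, BN, , BBB, 13' \
              '22, 67, GG, FF, 88' '33, BB, AA, CC, 22' > "$dir/file1.csv"
printf '%s\n' 'SA, 5006, 12, 15 , DJ' 'CN, BN, , BBB, 13' \
              '' '33, CC, AA, dd, 22' > "$dir/file2.csv"

awk '
BEGIN { FS = OFS = ", " }            # assumption: fields are separated by exactly ", "
NR == FNR { a[FNR] = $0; next }      # first file: remember each line by its line number
{
    split(a[FNR], b)                 # fields of the corresponding line in the first file
    diff = ""
    for (i = 1; i <= 5; i++)         # appending "" forces a string comparison
        if (($i "") != (b[i] "")) diff = diff i
    if (diff == "") {
        $6 = "match"
    } else {
        $6 = "unmatch"               # assigning $6 pads an empty line to five empty fields
        $7 = diff
    }
    print
}
' "$dir/file1.csv" "$dir/file2.csv" > "$dir/file3.csv"

cat "$dir/file3.csv"
```

On the sample data this should report field 4 for the first line, fields 1 to 5 for the empty line, and fields 2 and 4 for the last line. The empty line is printed as five empty fields, not as the "empt" placeholders shown in the question.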