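In the bash below I am trying to use awk to verify that the order of the headers in tab-delimited text files is exactly the same between key (the file that defines the fields and their order) and the text files in a directory, usually 3 of them.
If the order is correct, that is, a match is found between the files, then "FILENAME has expected order of fields" is printed. If the order does not match between the files, the result is "FILENAME order of $i is not correct", where $i is the field that is out of order, using key as the reference order. Thank you :).
key
Index Chr Start End Ref Alt Inheritance Score
file1.txt
Index Chr Start End Ref Alt Inheritance Score
1 1 10 100 A - . 2
file2.txt
Index Chr Start End Ref Alt Inheritance
1 1 10 100 A - . 2
2 1 20 100 A - . 5
file3.txt
Index Chr Start End Ref Alt Inheritance
1 1 10 100 A - . 2
2 1 20 100 A - . 5
3 1 75 100 A - . 2
4 1 25 100 A - . 5
awk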
for f in /home/cmccabe/Desktop/validate/*.txt ; do
bname=`basename $f`
awk '
FNR==NR {
order=(awk '!seen[$0]++ {lines[i++]=$0}
END {for (i in lines) if (seen[lines[i]]==1) print lines[i]})'
k=(awk '!seen[$0]++ {lines[i++]=$0}
END {for (i in lines) if (seen[lines[i]]==1) print lines[i]})'
if($order==$k) print FILENAME " has expected order of fields"
else
print FILENAME " order of $i is not correct"
}' key $f
done
Desired output
/home/cmccabe/Desktop/validate/file1.txt has expected order of fields
/home/cmccabe/Desktop/validate/file2.txt order of Score is not correct
/home/cmccabe/Desktop/validate/file3.txt order of Score is not correct
Answer 0 (score: 1)
Given those inputs, you can do the following:
awk 'FNR==NR { hn = split($0, header); next }   # key file: store the expected header
FNR==1 {                                        # header line of each data file
    n = split($0, fh)
    for (i = 1; i <= hn; i++)
        if (fh[i] != header[i]) {
            printf "%s order of %s is not correct\n", FILENAME, header[i]
            next                                # report the first mismatch only
        }
    if (hn == n)
        print FILENAME, "has expected order of fields"
    else
        print FILENAME, "has extra fields"
    next
}' key f{1..3}
Prints:
f1 has expected order of fields
f2 order of Score is not correct
f3 order of Score is not correct
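Since the question describes the files as tab-delimited, a variation of the same idea that splits strictly on tabs might look like the sketch below. It recreates the sample data from the question in a scratch directory (the directory and the result variable exist only for illustration) and passes -F'\t', so a header name containing spaces would still count as a single field. This is one possible arrangement under those assumptions, not the answer's exact command.

```shell
# Recreate the sample inputs from the question in a scratch directory (illustrative only).
dir=$(mktemp -d)
printf 'Index\tChr\tStart\tEnd\tRef\tAlt\tInheritance\tScore\n' > "$dir/key"
printf 'Index\tChr\tStart\tEnd\tRef\tAlt\tInheritance\tScore\n1\t1\t10\t100\tA\t-\t.\t2\n' > "$dir/file1.txt"
printf 'Index\tChr\tStart\tEnd\tRef\tAlt\tInheritance\n1\t1\t10\t100\tA\t-\t.\t2\n' > "$dir/file2.txt"
printf 'Index\tChr\tStart\tEnd\tRef\tAlt\tInheritance\n1\t1\t10\t100\tA\t-\t.\t2\n' > "$dir/file3.txt"

result=$(cd "$dir" && awk -F'\t' '
FNR==NR { hn = split($0, header); next }   # key file: remember the expected header
FNR==1 {                                   # header line of each data file
    for (i = 1; i <= hn; i++)
        if ($i != header[i]) {
            print FILENAME " order of " header[i] " is not correct"
            next                           # report only the first mismatch
        }
    if (NF == hn)
        print FILENAME " has expected order of fields"
    else
        print FILENAME " has extra fields"
}' key file1.txt file2.txt file3.txt)

printf '%s\n' "$result"
rm -rf "$dir"
```

Because split() falls back to FS when no separator is given, setting -F'\t' is enough to make both the key and the data headers split on tabs only.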
Answer 1 (score: 1)
$ cat tst.awk
NR==FNR { split($0,keys); next }
FNR==1 {
    allmatched = 1
    for (i=1; i in keys; i++) {
        if ($i != keys[i] ) {
            printf "%s order of %s is not correct\n", FILENAME, keys[i]
            allmatched = 0
        }
    }
    if ( allmatched ) {
        printf "%s has expected order of fields\n", FILENAME
    }
    nextfile
}
$ awk -f tst.awk key file1 file2 file3
file1 has expected order of fields
file2 order of Score is not correct
file3 order of Score is not correct
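The above uses GNU awk for nextfile, for efficiency. With other awks, just delete that statement and accept that the whole of each file will be read.
You did not include in your example a case where a header appears in a file but not in key, so I assume that cannot happen and you do not need the script to handle it.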
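If that case did need handling, one possible extension is to compare NF with the number of keys and report any surplus header names. The following is only a sketch: the nkeys variable, the "has unexpected field" wording, and the shortened three-column scratch files are assumptions made for illustration and do not come from the question. It also leaves out nextfile, so it should run in any POSIX awk at the cost of reading each file to the end.

```shell
# Hypothetical, shortened inputs: extra.txt has a header that is not in key.
dir=$(mktemp -d)
printf 'Index\tChr\tStart\n' > "$dir/key"
printf 'Index\tChr\tStart\n1\t1\t10\n' > "$dir/good.txt"
printf 'Index\tChr\tStart\tExtra\n1\t1\t10\tx\n' > "$dir/extra.txt"

result=$(cd "$dir" && awk '
NR==FNR { nkeys = split($0, keys); next }   # key file: store expected names and their count
FNR==1 {
    allmatched = 1
    for (i = 1; i <= nkeys; i++)            # every expected field, in order
        if ($i != keys[i]) {
            printf "%s order of %s is not correct\n", FILENAME, keys[i]
            allmatched = 0
        }
    for (i = nkeys + 1; i <= NF; i++) {     # headers beyond those listed in key
        printf "%s has unexpected field %s\n", FILENAME, $i
        allmatched = 0
    }
    if (allmatched)
        printf "%s has expected order of fields\n", FILENAME
}' key good.txt extra.txt)

printf '%s\n' "$result"
rm -rf "$dir"
```

Only lines with FNR==1 match a pattern in the data files, so dropping nextfile changes the running time but not the output.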