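I have a large file (file_new.txt) that contains sets of attributes, each with a count of how often it occurred. In some sets, certain attributes and their values are missing compared with the attributes in a sample file (sample.txt).
sample.txt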
apple = 0
black = 0
cat = 0
dog = 0
elephant = 0
file_next.txt
apple = 6
black = 7
elephant = 8
==============
apple=9
cat = 10
elephant =11
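I am looking for output like the following (any attribute that is in sample.txt but missing from a set in file_new.txt should be added to that set with a value of zero):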
file_output.txt
apple = 6
black = 7
cat = 0
dog = 0
elephant = 8
=============
apple = 9
black = 0
cat = 10
dog = 0
elephant = 11
Note: the first and last attributes are always present in every set (here, apple and elephant).
Thanks.
Answer 0 (score: 1)
$ cat tst.awk
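BEGIN { FS="[[:space:]]*=[[:space:]]*"; OFS=" = " }
# First file (sample.txt): remember attribute order and default values.
NR==FNR { names[++numNames] = $1; dflt[$1] = $2; next }
# Separator line: print the completed record, then the separator itself.
/^=+$/ { prtRec(); print; next }
# Any other line: store the value seen in the current record.
{ curr[$1] = $2 }
END { prtRec() }
function prtRec(    nameNr, name) {
    for (nameNr=1; nameNr<=numNames; nameNr++) {
        name = names[nameNr]
        print name, (name in curr ? curr[name] : dflt[name])
    }
    delete curr
}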
$ awk -f tst.awk sample.txt file_next.txt
apple = 6
black = 7
cat = 0
dog = 0
elephant = 8
==============
apple = 9
black = 0
cat = 10
dog = 0
elephant = 11
Or, if you don't care about the order of the lines within each output record, it's even simpler:
$ cat tst2.awk
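BEGIN { FS="[[:space:]]*=[[:space:]]*"; OFS=" = " }
# First file (sample.txt): remember the default value of each attribute.
NR==FNR { dflt[$1] = $2; next }
/^=+$/ { prtRec(); print; next }
{ curr[$1] = $2 }
END { prtRec() }
function prtRec(    name) {
    # "for (name in dflt)" visits the keys in an unspecified order.
    for (name in dflt) {
        print name, (name in curr ? curr[name] : dflt[name])
    }
    delete curr
}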
$ awk -f tst2.awk sample.txt file_next.txt
apple = 6
elephant = 8
cat = 0
black = 7
dog = 0
==============
apple = 9
elephant = 11
cat = 10
black = 0
dog = 0
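For anyone who wants to try the ordered approach without creating the files by hand, here is a minimal self-contained sketch. It recreates the question's input in a temporary directory and runs the same logic inline. The workdir and result variable names are my own, and split("", curr) is used in place of delete curr only as a conservative choice for older awk implementations; treat it as a sketch rather than a replacement for the script above.

```shell
#!/bin/sh
# Scratch directory so the demo does not touch existing files (name is arbitrary).
workdir=$(mktemp -d)

cat > "$workdir/sample.txt" <<'EOF'
apple = 0
black = 0
cat = 0
dog = 0
elephant = 0
EOF

cat > "$workdir/file_next.txt" <<'EOF'
apple = 6
black = 7
elephant = 8
==============
apple=9
cat = 10
elephant =11
EOF

result=$(awk '
BEGIN   { FS = "[[:space:]]*=[[:space:]]*"; OFS = " = " }
# First file: remember attribute order and default values.
NR==FNR { names[++numNames] = $1; dflt[$1] = $2; next }
# Separator line: flush the finished set, then echo the separator.
/^=+$/  { prtRec(); print; next }
# Any other line: record the value seen in the current set.
        { curr[$1] = $2 }
END     { prtRec() }
function prtRec(   i, name) {
    for (i = 1; i <= numNames; i++) {
        name = names[i]
        print name, (name in curr ? curr[name] : dflt[name])
    }
    split("", curr)   # empties the array without relying on "delete curr"
}
' "$workdir/sample.txt" "$workdir/file_next.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Because the attribute order is taken from sample.txt, each set in the output lists all five attributes in the sample's order, with zero filled in for the missing ones.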
Answer 1 (score: 0)
awk -F '[[:blank:]]*=[[:blank:]]*' '
function Feed() {
    for (Key in ToAdd) {
        if (ToAdd[Key] == 1) print Sample[Key]
        else ToAdd[Key] = 1
    }
    return
}
FNR == NR { Sample[$1] = $0; ToAdd[$1] = 1 }
FNR != NR && $0 !~ /^=====/ { ToAdd[$1] = 0; print }
$0 ~ /^=====/ { Feed(); print }
END { Feed() }
' Sample.txt file_new.txt
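This relies on:
a line starting with ===== as the separator between sets
the file order, which is mandatory (the sample file first, then the data file)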
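The file-order requirement follows from the FNR == NR test that both answers use: NR counts lines across all input files, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read. Here is a small sketch to see this for yourself; the file names first.txt and second.txt and the variable names are made up for the demonstration.

```shell
#!/bin/sh
# Throwaway directory and files, used only for this demonstration.
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/first.txt"
printf 'c\n'    > "$workdir/second.txt"

# cd inside the command substitution so FILENAME prints as a bare name.
trace=$(cd "$workdir" && awk '
{ print FILENAME, NR, FNR, (NR == FNR ? "first-file rule" : "later-file rule") }
' first.txt second.txt)

printf '%s\n' "$trace"
rm -rf "$workdir"
```

Only the lines of first.txt get the "first-file rule" label, which is why the sample file has to come first on the command line. One caveat: if the first file is empty, NR == FNR is also true while the second file is read, so the idiom assumes a non-empty sample file.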