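I am trying to find a way, possibly using awk, to interpolate between two rows of data in a CSV file. Right now, each row represents a data point at hour 0 and hour 6. I would like to fill in the missing hourly data between hour 0 and hour 6.
Current CSV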
lat,lon,fhr
33.90000,-76.50000,0
34.20000,-77.00000,6
Expected interpolated output
lat,lon,fhr
33.90000,-76.50000,0
33.95000,-76.58333,1
34.00000,-76.66667,2
34.05000,-76.75000,3
34.10000,-76.83333,4
34.15000,-76.91667,5
34.20000,-77.00000,6
Answer 0 (score: 1)
Here is an awk file that should do the job:
# initialize lastTime, also used as a flag to show that the 1st data line has been read
BEGIN { lastTime = -100 }
# match data lines (a leading minus sign is allowed so negative latitudes are not skipped)
/^-?[0-9]/ {
    if (lastTime == -100) {
        # this is the first data line, print it
        print;
    } else {
        if ($3 == lastTime + 1) {
            # increment of 1 hour, no need to interpolate
            print;
        } else {
            # increment other than 1 hour, interpolate the missing hours
            for (i = 1; i < $3 - lastTime; i = i + 1) {
                # printf keeps 5 decimal places; plain print would cut the values to 6 significant digits
                printf "%.5f,%.5f,%d\n", lastLat + ($1 - lastLat) * (i / ($3 - lastTime)), lastLon + ($2 - lastLon) * (i / ($3 - lastTime)), lastTime + i
            }
            print;
        }
    }
    # save the current values for the next line
    lastTime = $3;
    lastLon = $2;
    lastLat = $1;
}
/lat/ {
    # this is the header line, just print it
    print;
}
Run it as
awk -F, -f test.awk test.csv
I am assuming that your third column holds integer values.
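If you would rather not keep a separate .awk file, the same linear interpolation can be sketched as a small shell function. This is only a sketch under a few assumptions: the function name interpolate_csv is made up for illustration, the header is taken to be the first line of the file, the third column holds integer hours in increasing order, and the output is formatted to 5 decimal places to match the expected output in the question.

```shell
#!/bin/sh
# interpolate_csv FILE
# Hypothetical helper (the name is an assumption, not part of the original answer).
# Prints FILE with the missing hourly rows filled in by linear interpolation.
interpolate_csv() {
    awk -F, '
    # assumption: the first line is the header, pass it through unchanged
    NR == 1 { print; next }
    # from the second data row on, fill the gap back to the previous row
    have_prev {
        span = $3 - prev_t
        # emit only the hours strictly between the previous row and this one
        for (i = 1; i < span; i++)
            printf "%.5f,%.5f,%d\n", prev_lat + ($1 - prev_lat) * i / span, prev_lon + ($2 - prev_lon) * i / span, prev_t + i
    }
    # print the original row and remember it for the next gap
    { print; prev_lat = $1; prev_lon = $2; prev_t = $3; have_prev = 1 }
    ' "$1"
}
```

After defining the function in your shell (or sourcing the script), call it as interpolate_csv test.csv. Rows that are already one hour apart are passed through untouched, because the loop body never runs when the gap is 1.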