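I want to find the average rainfall for a particular month (January through December) for the three states CA, TX and AX. The input file is tab-delimited and has the format
city name, the state, and then average rainfall amounts from January through December, and then an annual average for all months.
For example, it might look like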
AVOCA PA 30 2.10 2.15 2.55 2.97 3.65 3.98 3.79 3.32 3.31 2.79 3.06 2.51 36.18
BAKERSFIELD CA 30 0.86 1.06 1.04 0.57 0.20 0.10 0.01 0.09 0.17 0.29 0.70 0.63 5.72
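What I want to do is "get the sum of the average rainfall for a particular month, over n years, and then find the average for the states CA, TX and AX".
I wrote the script below in awk to do this, but it does not give me the expected output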
/^CA$/ {CA++; CA_SUM+= $5} # ^CA$ - Regular Expression to match the word CA only
/^TX$/ {TX++; TX_SUM+= $5} # ^TX$ - Regular Expression to match the word TX only
/^AX$/ {AX++; AX_SUM+= $5} # ^AX$ - Regular Expression to match the word AX only
END {
CA_avg = CA_SUM/CA;
TX_avg = TX_SUM/TX;
AX_avg = AX_SUM/AX;
printf("CA Rainfall: %5.2f",CA_avg);
printf("CA Rainfall: %5.2f",TX_avg);
printf("CA Rainfall: %5.2f",AX_avg);
}
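I invoke the program with the command
awk 'FS="\t"'-f awk1.awk rainfall.txt
and see no output.
Question: where am I slipping up? Any suggestions and corrected code would be greatly appreciated.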
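Answer 0: (score: 3)
The pattern /^CA$/
matches only when the characters "C" and "A" are the only characters on the line. You want to compare the second field instead: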
$2 == "CA" {CA++; CA_SUM+= $5}
# etc.
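To illustrate the difference, here is a minimal sketch that runs both patterns over a two-line sample. The rows are shortened versions of the ones in the question, and the variable names (`sample`, `regex_hits`, `field_hits`) are only for this example.

```shell
# Two tab-separated rows, cut down to five columns from the sample in the question.
sample=$(printf 'AVOCA\tPA\t30\t2.10\t2.15\nBAKERSFIELD\tCA\t30\t0.86\t1.06\n')

# Whole-line regex: only a line that is exactly "CA" would match, so this counts 0.
regex_hits=$(printf '%s\n' "$sample" | awk -F '\t' '/^CA$/ { n++ } END { print n + 0 }')

# Field comparison: counts records whose second field is "CA".
field_hits=$(printf '%s\n' "$sample" | awk -F '\t' '$2 == "CA" { n++ } END { print n + 0 }')

echo "regex: $regex_hits, field: $field_hits"
```

With this sample the regex version finds no records while the field comparison finds the BAKERSFIELD row.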
However, this is DRYer:
{ count[$2]++; sum[$2] += $5 }
END {
for (state in count) {
printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
}
}
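As a rough check of the array-based version, the sketch below feeds it three shortened rows. Two come from the question; the EXAMPLEVILLE row is made up so that CA has more than one city to average. It assumes the third column (30) is a year count, which puts February in $5.

```shell
# Rows are tab-separated and cut down to five columns; EXAMPLEVILLE is a made-up row.
averages=$(printf '%s\t%s\t%s\t%s\t%s\n' \
    AVOCA PA 30 2.10 2.15 \
    BAKERSFIELD CA 30 0.86 1.06 \
    EXAMPLEVILLE CA 30 1.00 2.00 |
awk -F '\t' '
    { count[$2]++; sum[$2] += $5 }    # $5 is the February column in this layout
    END {
        for (state in count)
            printf("%s Rainfall: %5.2f\n", state, sum[state] / count[state])
    }
' | sort)    # for-in order is unspecified, so sort for stable output
echo "$averages"
```

Because the arrays are keyed by whatever appears in the second field, every state in the file gets a line, not only CA, TX and AX.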
Also, this looks wrong: awk 'FS="\t"'-f awk1.awk rainfall.txt
Try: awk -F '\t' -f awk1.awk rainfall.txt
In response to the comments:
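A likely explanation for the missing output: there is no space between the closing quote and -f, so the shell passes FS="\t"-f to awk as a single word. awk then treats that word as the whole program, and awk1.awk and rainfall.txt as input files. The sketch below shows the joining and the effect on a one-line input; `show_args` is a hypothetical helper written for this example.

```shell
# Helper (hypothetical name) that reports how many arguments it got and the first one.
show_args() { printf '%d args, first: %s\n' "$#" "$1"; }

# No space between the closing quote and -f, so the shell joins them into one word.
joined=$(show_args 'FS="\t"'-f awk1.awk rainfall.txt)
echo "$joined"

# awk takes that word as the whole program; here it prints nothing for the input line.
broken=$(printf 'x\ty\n' | awk 'FS="\t"'-f)

# With -F the separator is set before any input is read.
fixed=$(printf 'x\ty\n' | awk -F '\t' '{ print $2 }')
echo "broken: [$broken], fixed: [$fixed]"
```

In the awk implementations I would expect to see, the joined program parses as the assignment FS = "\t" - f, which evaluates to 0, so the pattern is false for every record and nothing is printed.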
awk -F '\t' -v month=2 -v states="CA,AZ,TX" '
BEGIN {
    month_col = month + 3    # assume January is month 1, so it sits in column 4
    n = split(states, wanted_states, /,/)
}
{ count[$2]++; sum[$2] += $month_col }
END {
    # split() stores the states under indices 1..n, so look each one up by index
    for (i = 1; i <= n; i++) {
        state = wanted_states[i]
        if (state in count)
            printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
        else
            print state " Rainfall: no data"
    }
}
' rainfall.txt
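One detail worth checking in a script like this is how split() fills its array: the pieces are stored under the numeric indices 1..n, so a `for (x in wanted_states)` loop yields those indices and the state code itself is `wanted_states[x]`. A minimal sketch, using the same state list as above:

```shell
# Print each index next to the value split() stored under it.
pairs=$(awk 'BEGIN {
    n = split("CA,AZ,TX", wanted_states, /,/)
    for (i = 1; i <= n; i++)
        printf("%s%d=%s", (i > 1 ? " " : ""), i, wanted_states[i])
    print ""
}')
echo "$pairs"
```

Looping from 1 to n also keeps the states in the order they were given, which a for-in loop does not guarantee.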
Answer 1: (score: 2)
Your regular expressions should be
/ CA / {CA++; CA_SUM+= $5} # / CA / matches CA with a space on each side, anywhere on the line
/ TX / {TX++; TX_SUM+= $5} # / TX / matches TX with a space on each side, anywhere on the line
/ AX / {AX++; AX_SUM+= $5} # / AX / matches AX with a space on each side, anywhere on the line
/^AX$/ matches only when AX is the only word on the line.
HTH!
Edit
/ CA / {CA++; CA_SUM+= $5} # / CA / matches CA with a space on each side, anywhere on the line
/ TX / {TX++; TX_SUM+= $5} # / TX / matches TX with a space on each side, anywhere on the line
/ AX / {AX++; AX_SUM+= $5} # / AX / matches AX with a space on each side, anywhere on the line
END {
# Only divide when at least one record matched, to avoid division by zero
if(CA!=0){CA_avg = CA_SUM/CA; printf("CA Rainfall: %5.2f\n",CA_avg);}
if(TX!=0){TX_avg = TX_SUM/TX; printf("TX Rainfall: %5.2f\n",TX_avg);}
if(AX!=0){AX_avg = AX_SUM/AX; printf("AX Rainfall: %5.2f\n",AX_avg);}
}