I have a file named order.csv, with data like this:
"Company","New Add Date"
"ELECTRICAL INSULATION SUPPLIES","200212"
"AVIS BUDGET GROUP","201110"
"HONEYWELL AEROSPACE","201307"
"AVIS BUDGET GROUP","201110"
"MERCK SHARP & DOHME","199608"
"PHARMA-BIO SERV INC","200803"
"UPS STORE","200407"
"PROCTER & GAMBLE","200403"
"W HOLDING CO INC","200712"
"AVIS BUDGET GROUP","201110"
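I want to append the last day of the month to the second column, based on its last 2 characters (the month). For that I am using this command: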
awk -F, 'BEGIN{A[01]="31";A[02]="28";A[03]="31";A[04]="30";A[05]="31";A[06]="30";A[07]="31";A[08]="31";A[09]="30";A[10]="31";A[11]="30";A[12]="31";}{ print $1, substr($2,2,6)A[substr($2,6,2)] }' order.txt
This is the output:
"Company" New Ad
"ELECTRICAL INSULATION SUPPLIES" 20021231
"AVIS BUDGET GROUP" 20111031
"HONEYWELL AEROSPACE" 201307
"AVIS BUDGET GROUP" 20111031
"MERCK SHARP & DOHME" 199608
"PHARMA-BIO SERV INC" 200803
"UPS STORE" 200407
"PROCTER & GAMBLE" 200403
"W HOLDING CO INC" 20071231
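I am not getting the result I expect. What am I doing wrong?
Answer 0 (score: 2)
Because the number of days in February depends on whether the year is a leap year, the number of days in a month depends on both the month and the year.
You can use the following gawk (GNU awk) script to do this:
last_day.awk: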
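# Return the number of days in the given month of the given year.
# mktime() normalizes out-of-range days, so "day 31" of a shorter month
# rolls over into the first days of the next month.
function days_per_month(year, month,    date, day) {
    date = year" "month" 31 00 00 00"
    day = strftime("%d", mktime(date))
    return 31-day%31
}
# On every line of input
{
    year = substr($2,2,4)
    month = substr($2,6,2)
    last_day = days_per_month(year, month)
    print $1, year""month""last_day
}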
Call it like this:
gawk -F, -f last_day.awk order.csv
By the way, this is gawk-specific because it uses mktime() and strftime().
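If gawk is not available, the same month-and-year dependence can be sketched in POSIX awk with the Gregorian leap-year rule instead of mktime(). This is only a sketch under a few assumptions: the names `last_day_portable` and `days_in_month` are made up for illustration, the second column is always a quoted 6-digit YYYYMM value, and no company name contains a comma.

```shell
# Hypothetical wrapper: append the last day of the month to a quoted YYYYMM
# second column, using only POSIX awk features (no mktime or strftime).
last_day_portable() {
    awk -F, '
    # Days in a month under the Gregorian leap-year rule.
    function days_in_month(year, month,    d) {
        split("31 28 31 30 31 30 31 31 30 31 30 31", d, " ")
        if (month == 2 && ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0))
            return 29
        return d[month]
    }
    NR == 1 { print; next }               # pass the header line through
    {
        year  = substr($2, 2, 4) + 0      # +0 forces a number
        month = substr($2, 6, 2) + 0      # "07" becomes 7
        printf "%s,\"%04d%02d%02d\"\n", $1, year, month, days_in_month(year, month)
    }' "$@"
}
```

It would be called as `last_day_portable order.csv`, or with the data piped in on standard input.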
Answer 1 (score: 1)
Try the following awk command. You don't need to build the array by hard-coding each index; split can create it for you:
awk -F'[",]' '
BEGIN{
split("31,28,31,30,31,30,31,31,30,31,30,31", month,",")
}
{
month[2]=((substr($5,1,4)%4+0)==0 && (substr($5,1,4)%100+0!=0)) || (substr($5,1,4)%400+0==0)?29:28;
val=substr($5,5,2)~/^0/?1:2;
print substr($0,1,length($0)-1)\
month[substr($0,length($0)-val,val)]\
substr($0,length($0))
}
' Input_file
This will also take care of leap years.
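To see what split builds, and why the lookup above strips the leading zero before indexing, here is a small sketch. The helper name `month_days` is hypothetical, and it ignores leap years because it only illustrates the subscripts.

```shell
# Hypothetical helper: look up a two-digit month string in a split-built table.
month_days() {
    awk -v mm="$1" 'BEGIN {
        # split returns the number of pieces and fills days[1] .. days[12]
        n = split("31,28,31,30,31,30,31,31,30,31,30,31", days, ",")
        key = substr(mm, 1, 2)            # substr always yields a string
        # The string "07" is not the subscript "7"; key + 0 is the number 7
        printf "%d entries; days[\"%s\"]=[%s] days[%s+0]=[%s]\n", n, key, days[key], key, days[key + 0]
    }'
}

month_days 07    # 12 entries; days["07"]=[] days[07+0]=[31]
month_days 11    # 12 entries; days["11"]=[30] days[11+0]=[30]
```

Adding 0 to the month string is an alternative to computing how many characters to take, since it turns "07" into the number 7 before it is used as a subscript.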
Answer 2 (score: 1)
$ cat tst.awk
BEGIN { FS=OFS="\"" }
NR>1 {
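# Get the secs since epoch for noon on the 1st of next month, then subtract
# 1 day's worth of seconds to land on the last day of this month. Noon
# (rather than midnight) keeps the result on the right day even when a
# daylight saving change makes that day 23 or 25 hours long.
nextMth = substr($4,5) % 12 + 1
year = substr($4,1,4) + (nextMth == 1 ? 1 : 0)
secs = mktime(year" "nextMth" 1 12 0 0") - 24*60*60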
$4 = strftime("%Y%m%d",secs)
}
{ print }
$ awk -f tst.awk file
"Company","New Add Date"
"ELECTRICAL INSULATION SUPPLIES","20021231"
"AVIS BUDGET GROUP","20111031"
"HONEYWELL AEROSPACE","20130731"
"AVIS BUDGET GROUP","20111031"
"MERCK SHARP & DOHME","19960831"
"PHARMA-BIO SERV INC","20080331"
"UPS STORE","20040731"
"PROCTER & GAMBLE","20040331"
"W HOLDING CO INC","20071231"
"AVIS BUDGET GROUP","20111031"
Answer 3 (score: -1)
Sorry guys, I just made a mistake and have now corrected it. I think the leading 0 was being ignored, so I now use strings as the keys:
awk -F, 'BEGIN{A["01"]="31";A["02"]="28";A["03"]="31";A["04"]="30";A["05"]="31";A["06"]="30";A["07"]="31";A["08"]="31";A["09"]="30";A["10"]="31";A["11"]="30";A["12"]="31";}{ print $1, substr($2,2,6)A[substr($2,6,2)] }' order.txt
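The difference between the two commands can be checked directly. In awk, an unquoted 07 in the program text is a numeric constant, so A[07] is stored under the subscript "7", while substr() always returns a string such as "07". A minimal sketch (the function name `show_keys` is made up here):

```shell
# Hypothetical demo: numeric-constant keys versus string keys in awk arrays.
show_keys() {
    awk 'BEGIN {
        A[07] = "numeric-constant key"     # stored under subscript "7"
        B["07"] = "string key"             # stored under subscript "07"
        m = substr("201307", 5, 2)         # the string "07"
        printf "A[m]=[%s] B[m]=[%s] A[m+0]=[%s]\n", A[m], B[m], A[m + 0]
    }'
}

show_keys    # A[m]=[] B[m]=[string key] A[m+0]=[numeric-constant key]
```

That is why months 10 and 12 worked in the original output while 03, 07 and 08 did not. Note that this fixed table still returns 28 for every February, so it does not handle leap years.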