This is my input txt file:
2013121612,HCDC,0
2013121613,HCDC,84
2013121614,HCDC,100
2013121615,HCDC,98
2013121612,MSLP,1023.83
2013121613,MSLP,1023.02
2013121614,MSLP,1022.08
2013121615,MSLP,1021.61
2013121612,MAXT,12.723
2013121613,MAXT,13.412
2013121614,MAXT,13.41
2013121615,MAXT,12.482
This is my BAD or INSUFFICIENT code:
awk -F"," '/MAXT|HCDC|MSLP/ {print $1,"\t",$3,"\t",$3,"\t",$3}' input.txt >> ouput.txt
This is the output file:
DATE MAXT HCDC MSLP
2013121612 0 0 0
2013121613 84 84 84
2013121614 100 100 100
2013121615 98 98 98
2013121612 1023.83 1023.83 1023.83
2013121613 1023.02 1023.02 1023.02
2013121614 1022.08 1022.08 1022.08
2013121615 1021.61 1021.61 1021.61
2013121612 12.723 12.723 12.723
2013121613 13.412 13.412 13.412
2013121614 13.41 13.41 13.41
2013121615 12.482 12.482 12.482
What I need is this output format:
DATE MAXT HCDC MSLP
2013121612 12.723 0 1023.83
2013121613 13.412 84 1023.02
2013121614 13.41 100 1022.08
2013121615 12.482 98 1021.61
I have to ask for help because I know very little about Unix.
Thanks a lot.
Answer 0 (score: 2)
Here is one way to do it:
awk -F, '
    {
        key[$1] = 1          # remember every date seen
        data[$1,$2] = $3     # store the value under (date, field)
    }
    END {
        print "DATE","MAXT","HCDC","MSLP"
        for (k in key)
            print k, data[k,"MAXT"], data[k,"HCDC"], data[k,"MSLP"]
    }
' input.txt | column -t
DATE MAXT HCDC MSLP
2013121612 12.723 0 1023.83
2013121613 13.412 84 1023.02
2013121614 13.41 100 1022.08
2013121615 12.482 98 1021.61
Because I used associative arrays, the order of the keys is not guaranteed. If you need the output sorted, use bash code like this:
{
echo DATE MAXT HCDC MSLP
awk -F, '
{ key[$1] = 1; data[$1,$2] = $3 }
END { for (k in key) print k, data[k,"MAXT"], data[k,"HCDC"], data[k,"MSLP"] }
' input.txt | sort
} | column -t
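For readers who think more easily in Python, the same idea (one collection of dates, one lookup keyed by date and field, and an explicit sort at the end) can be sketched as below. This is only an illustration: the names `rows`, `keys`, `data` and `lines` are made up here, and the sample records are copied from the question's input.txt.

```python
# Sample records copied from the question's input.txt (date,field,value).
rows = """\
2013121612,HCDC,0
2013121613,HCDC,84
2013121614,HCDC,100
2013121615,HCDC,98
2013121612,MSLP,1023.83
2013121613,MSLP,1023.02
2013121614,MSLP,1022.08
2013121615,MSLP,1021.61
2013121612,MAXT,12.723
2013121613,MAXT,13.412
2013121614,MAXT,13.41
2013121615,MAXT,12.482
""".splitlines()

keys = set()  # plays the role of awk's key[$1]
data = {}     # plays the role of awk's data[$1,$2]
for row in rows:
    date, field, value = row.split(",")
    keys.add(date)
    data[date, field] = value  # tuple key, like awk's composite subscript

lines = ["DATE MAXT HCDC MSLP"]
for date in sorted(keys):  # a set has no useful order, so sort explicitly
    lines.append(" ".join(
        [date, data[date, "MAXT"], data[date, "HCDC"], data[date, "MSLP"]]))
print("\n".join(lines))
```

As in the awk version, a date that lacks one of the three fields is not handled here (the lookup raises a KeyError).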
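Answer 1 (score: 1)
You are basically trying to pivot the table, reshaping it using two columns. You could use a specialized language (R is very good at this kind of task). awk is not the best language for this kind of work (although it can certainly be done). I would suggest writing it in Python, which may be a bit easier. An outline of the code (without error checking etc.) follows: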
tbl = {}  # map date to a dict of colname -> value
# ingest the data
with open("input.txt") as myfile:
    for line in myfile:
        rec = line.strip().split(",")  # fields are comma-separated
        if rec[0] not in tbl:
            tbl[rec[0]] = {}
        tbl[rec[0]][rec[1]] = float(rec[2])
# output the table, sorted by date
print("DATE", "MAXT", "HCDC", "MSLP")
for date in sorted(tbl):
    print(date, tbl[date]["MAXT"], tbl[date]["HCDC"], tbl[date]["MSLP"])
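Note that this would probably be easier with NumPy (practically a two-liner), but I'm not sure it is worth adding as a dependency for such a small task.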
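To make the "pivot" idea a bit more concrete without any third-party dependency, here is a rough standard-library sketch that discovers the column names from the data instead of hardcoding them. The function name `pivot` and its `missing` parameter are hypothetical, and the sample records are a subset of the question's input; treat it as an outline under those assumptions rather than a finished tool.

```python
import csv


def pivot(records, missing=""):
    """Reshape date,field,value records into one row per date.

    `records` is any iterable of comma-separated lines (a list or an open file).
    """
    table = {}   # date -> {field: value}
    fields = []  # field names in the order they first appear
    for date, field, value in csv.reader(records):
        table.setdefault(date, {})[field] = value
        if field not in fields:
            fields.append(field)
    header = ["DATE"] + fields
    body = [[date] + [table[date].get(f, missing) for f in fields]
            for date in sorted(table)]
    return [header] + body


# A subset of the question's data, used only for illustration.
sample = [
    "2013121612,HCDC,0",
    "2013121613,HCDC,84",
    "2013121612,MSLP,1023.83",
    "2013121613,MSLP,1023.02",
    "2013121612,MAXT,12.723",
    "2013121613,MAXT,13.412",
]
wide = pivot(sample)
for row in wide:
    print(" ".join(row))
```

The columns come out in first-seen order (HCDC, MSLP, MAXT for this data), so reorder `fields` if a fixed layout is required. To read the real file, pass an open file object: `pivot(open("input.txt", newline=""))`.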
Answer 2 (score: 1)
awk -F, '!($1 in seen){dr[++i]=$1};{d=$1; v=$3; $0=$2; seen[d]++};
/HCDC/{HCDC[d]=v}; /MSLP/{MSLP[d]=v};/MAXT/{MAXT[d]=v};
END{print "DATE", "MAXT", "HCDC", "MSLP";
for (j=1; j<=i; ++j) {print dr[j], (dr[j] in MAXT)? MAXT[dr[j]]: 0,
(dr[j] in HCDC)? HCDC[dr[j]]: 0,
(dr[j] in MSLP)? MSLP[dr[j]]: 0}}' input.txt
DATE MAXT HCDC MSLP
2013121612 12.723 0 1023.83
2013121613 13.412 84 1023.02
2013121614 13.41 100 1022.08
2013121615 12.482 98 1021.61
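The two things this awk script does differently, keeping dates in the order they first appear and printing 0 for a missing value, can be sketched in Python roughly as follows. The function name `pivot_in_order`, its parameters, and the deliberately incomplete, out-of-order `sample` are made up for illustration; they are not part of the question's data.

```python
def pivot_in_order(rows, fields=("MAXT", "HCDC", "MSLP"), default="0"):
    """Pivot date,field,value lines, keeping dates in first-seen order."""
    seen = {}  # date -> {field: value}; dicts keep insertion order (Python 3.7+)
    for row in rows:
        date, field, value = row.split(",")
        seen.setdefault(date, {})[field] = value
    out = ["DATE " + " ".join(fields)]
    for date, values in seen.items():
        # Fall back to `default` when a date has no value for a field.
        out.append(" ".join([date] + [values.get(f, default) for f in fields]))
    return out


# Hypothetical sample: out of date order and with fields missing on purpose.
sample = [
    "2013121613,HCDC,84",
    "2013121612,HCDC,0",
    "2013121612,MAXT,12.723",
]
result = pivot_in_order(sample)
print("\n".join(result))
```

Here 2013121613 is printed before 2013121612 because it was seen first, and every missing field shows up as 0, which matches the behaviour of the `(dr[j] in MAXT)? ... : 0` tests in the awk version.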