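I have a CSV file that gets converted to .DAT. I also have an AWK file that is supposed to do the mapping for the DAT file. The code in the AWK file is shown below.
The contents of the DAT file look like this (tab-separated):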
ODT AGE CDT CO SEX TIME VALUE COMMENT
P3 Y6-8 ACT FG F 2011 1297
P4 Y3-4 EMP FG M 2011 6940 bd
P1 Y7-9 GRT FG F 2011 0 c
What I need to do is the following.
Not yet done:
4. if the VALUE is ":" then NUMB is null
if the VALUE is ":" and COMMENT "c" then NUMB is null and STRING_COM is "c"
if the VALUE is ":" and COMMENT "u" then NUMB is null and STRING_STATUS is "u"
if the VALUE is "14,385" and COMMENT "d" then NUMB is "14385" and STRING(both) is null
if the VALUE is "14,385" and COMMENT "du" then NUMB is "14385" and STRING_STATUS is "u"
if the VALUE is ":" and COMMENT "cd" then NUMB is null and STRING_COM is "c"
if the VALUE is ":" and COMMENT "bc" then NUMB is null and STRING_COM is "c" and STRING_STATUS is "b"
if the VALUE is ":" and COMMENT "z" then NUMB is 0 and STRING_STATUS is "z"
awk code:
BEGIN {
FS=","; OFS="\t";
a["ODT"]=1;a["AGE"]=1;a["CDT"]=1;a["CO"]=1;
a["SEX"]=1;a["TIME"]=1;a["VALUE"]=1;a["COMMENT"]=1;
}
NR==1 {
{ $a["VALUE"] = "NUMB" ; $a["COMMENT"] = "STRING_COM" ; $9 = "STRING_STATUS" ; print ; next }
$a["VALUE"]=="14,385" && $a["COMMENT"] == "d" { $a["VALUE"] = "14385" ; $a["COMMENT"] = $9 = "" }
$a["VALUE"]=="14,385" && $a["COMMENT"] == "du" { $a["VALUE"] = "14385" ; $a["COMMENT"] = "" ; $9 = "u" }
$a["VALUE"] != ":" { print ; next }
$a["COMMENT"] == "z" { $a["VALUE"] = "0" ; $a["COMMENT"] = "" ; $9 = "z" }
$a["COMMENT"] != "z" { $a["VALUE"] = "" }
$NF=substr($NF,1,length($NF)-1);
for(i=1;i<=NF;i++) if($i in a) a[$i]=i;
}
{ print $a["ODT"],$a["AGE"],$a["CDT"],$a["CO"],$a["SEX"],$a["TIME"],NR==1?"NUMB":$a["VALUE"],
NR==1?"STRING_COM"OFS"STRING_STATUS":($a["COMMENT"]?""OFS$a["COMMENT"]:$a["COMMENT"]);
}
Does anyone know how to solve point 4?
The expected result should be as follows.
CSV input:
ODT AGE CDT CO SEX TIME NUMB COMMENT
P3 Y6-8 AWT EE F 2011 1297
P4 Y3-4 ESP RR M 2011 6940 cd
P1 Y7-9 UDK FF F 2011 : du
PL Y3-9 EUP SS F 2011 : d
P9 Y_5 ACT DD F 2011 : cd
P6 Y5-9 UAK DF M 2011 : z
Expected output:
ODT AGE CDT CO SEX TIME NUMB STRING_COM STRING_STATUS
P3 Y6-8 AWT EE F 2011 1297
P4 Y3-4 ESP RR M 2011 6940 c
P1 Y7-9 UDK FF F 2011 u
PL Y3-9 EUP SS F 2011
P9 Y_5 ACT DD F 2011 c
P6 Y5-9 UAK DF M 2011 0 z
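Thanks in advance.
I have updated the code based on your suggestion, but it does not work; it only produces errors. Is this what you meant?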
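Answer 0 (score: 1)
The general approach I would take is to add a series of condition blocks, one per rule, including for the rules you have already implemented.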
BEGIN {
FS=","; OFS="\t";
}
NR==1 { $7 = "NUMB" ; $8 = "STRING_COM" ; $9 = "STRING_STATUS" ; print ; next }
$7=="14,385" && $8 == "d" { $7 = "14385" ; $8 = $9 = "" }
$7=="14,385" && $8 == "du" { $7 = "14385" ; $8 = "" ; $9 = "u" }
$7 != ":" { print ; next }
$8 == "z" { $7 = "0" ; $8 = "" ; $9 = "z" }
$8 != "z" { $7 = "" }
...
{ print }
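This may be missing things that your code already handles and that I have not fully grasped, but this is the spirit in which I would structure the script.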
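As a rough sketch of where that structure could lead, the rules from point 4 can also be collapsed into one block that derives all three output columns from VALUE and COMMENT. Several things here are assumptions rather than anything stated in the question: the input is taken to be tab-separated with VALUE in field 7 and COMMENT in field 8, the function name `map_dat` is made up for illustration, and the decomposition rule ("c" goes to STRING_COM, "d" is dropped, any other letter goes to STRING_STATUS) is only inferred from the listed examples.

```shell
# map_dat: hypothetical helper; reads tab-separated rows on stdin and
# writes the mapped rows (NUMB, STRING_COM, STRING_STATUS) to stdout.
map_dat() {
  awk '
    BEGIN { FS = OFS = "\t" }

    # Header row: rename VALUE/COMMENT and add the third output column.
    NR == 1 { $7 = "NUMB"; $8 = "STRING_COM"; $9 = "STRING_STATUS"; print; next }

    {
      value   = $7
      comment = $8

      gsub(/,/, "", value)                  # "14,385" -> "14385"
      if (value == ":")                     # ":" means no number...
        value = (comment ~ /z/) ? "0" : ""  # ...except with "z", which maps to 0

      com = (comment ~ /c/) ? "c" : ""      # assumed: "c" is the only STRING_COM flag
      status = comment
      gsub(/[cd]/, "", status)              # assumed: "d" is dropped, the rest is status

      $7 = value; $8 = com; $9 = status
      print
    }
  '
}
```

It could be tried with something like `map_dat < input.dat`, where `input.dat` is a placeholder file name. If the real data contains flag combinations beyond the ones listed in the question, the two assumed `gsub`/match rules would need to be revisited.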
Assuming the array a is meant to accommodate input whose fields arrive in a shuffled order, you can