How to extract fields from a complex file

Time: 2017-07-15 04:35:07

Tags: linux parsing awk

The text file:

$ cat filename.txt 
2017-07-15 00:00:27,000 NAME: THT TYPE: S {"cp":"R3"} 
2017-07-15 00:00:27,301 NAME: THT TYPE: S {"cp":"R3"} 
2017-07-15 00:00:26,993 NAME: THT TYPE: M {"bl":"t","cp":"R1","scp":"T5"}.

The command line I tried:

$ cat filename.txt |awk '{print $1,$2,$4,$6,$7}'
2017-07-15 00:00:27,000 THT S {"cp":"R3"}
2017-07-15 00:00:27,301 THT S {"cp":"R3"}
2017-07-15 00:00:26,993 THT M {"bl":"t","cp":"R1","scp":"T5"}

Desired output:

017-07-15 00,THT,S,R3 
017-07-15 00,THT,S,R3 
017-07-15 00,THT,M,R1

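I think an "if" could be used here, but I don't know how to use "if" in awk.

3 answers:

Answer 0: (score: 1)

Assuming your Input_file has the same format as the sample shown, please try the following awk and let me know whether it helps.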

awk -F'[ :{"]' 'NF>18{print substr($1,2),$2 s1 $7 s1 $10 s1 $21;next} {print substr($1,2),$2 s1 $7 s1 $10 s1 $16}' s1=","   Input_file

Here is the same solution in non-one-liner form.

awk -F'[ :{"]' 'NF>18{
                    print substr($1,2),$2 s1 $7 s1 $10 s1 $21;
                    next
                 }
                 {
                    print substr($1,2),$2 s1 $7 s1 $10 s1 $16
                 }
           ' s1=","  Input_file
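
Since the question asks specifically about "if", the same logic can also be written with an explicit if/else instead of the `NF>18{...;next}` pattern. This is only a sketch under the same assumption as above (the input looks exactly like the sample); the helper names `sample` and `extract_if` and the variable `cp` are made up for illustration and are not part of the original answer.

```shell
#!/bin/sh
# sample: hypothetical helper that prints the three sample lines from the question.
sample() {
cat <<'EOF'
2017-07-15 00:00:27,000 NAME: THT TYPE: S {"cp":"R3"}
2017-07-15 00:00:27,301 NAME: THT TYPE: S {"cp":"R3"}
2017-07-15 00:00:26,993 NAME: THT TYPE: M {"bl":"t","cp":"R1","scp":"T5"}
EOF
}

# extract_if: hypothetical helper; same field separator as the answer above,
# but the NF test is written as an if/else inside a single action block.
extract_if() {
    awk -F'[ :{"]' -v s1="," '{
        if (NF > 18)
            cp = $21    # longer JSON part: the "cp" value lands in field 21
        else
            cp = $16    # short JSON part: the "cp" value lands in field 16
        print substr($1, 2), $2 s1 $7 s1 $10 s1 cp
    }'
}

sample | extract_if
```

The field numbers still depend on the exact layout of each line, so this variant is no more robust than the original; it only shows where an `if` would go.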

Answer 1: (score: 1)

$ awk -v OFS=',' '{match($NF,/"cp":"[^"]+/); print substr($0,2,12), $4, $6, substr($NF,RSTART+6,RLENGTH-6)}' file
017-07-15 00,THT,S,R3
017-07-15 00,THT,S,R3
017-07-15 00,THT,M,R1

Answer 2: (score: 0)

gawk solution:

awk '{ $7=gensub(/.*"cp":"([^"]+)".*/,"\\1","g",$7); 
       print substr($1,2)" "substr($2,1,2),$4,$6,$7  }' OFS=',' filename.txt

Output:

017-07-15 00,THT,S,R3
017-07-15 00,THT,S,R3
017-07-15 00,THT,M,R1
  • $7=gensub(/.*"cp":"([^"]+)".*/,"\\1","g",$7) - replaces $7 with the value of the "cp" attribute, captured by the group ([^"]+)
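
gensub() is a gawk extension, so the command above fails on awks that lack it. If gawk is not available, something along these lines might work with a POSIX awk: two sub() calls strip everything around the "cp" value instead of capturing it. This is a sketch that assumes the JSON part is the last whitespace-separated field and contains exactly one `"cp":"` key; the helper names `sample` and `extract_cp` are made up for illustration.

```shell
#!/bin/sh
# sample: hypothetical helper that prints the three sample lines from the question.
sample() {
cat <<'EOF'
2017-07-15 00:00:27,000 NAME: THT TYPE: S {"cp":"R3"}
2017-07-15 00:00:27,301 NAME: THT TYPE: S {"cp":"R3"}
2017-07-15 00:00:26,993 NAME: THT TYPE: M {"bl":"t","cp":"R1","scp":"T5"}
EOF
}

# extract_cp: hypothetical helper that uses only sub() and substr(),
# both of which are available in POSIX awk.
extract_cp() {
    awk -v OFS=',' '{
        cp = $NF                   # work on a copy of the JSON field
        sub(/.*"cp":"/, "", cp)    # drop everything up to and including "cp":"
        sub(/".*/, "", cp)         # drop the closing quote and whatever follows
        print substr($1, 2) " " substr($2, 1, 2), $4, $6, cp
    }'
}

sample | extract_cp
```

Note that `"scp":"` does not match the first pattern, because the pattern requires a double quote directly before `cp`.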