How do I extract the final character of a specific string in awk and append it as a column?

Time: 2017-03-26 21:04:23

Tags: string bash parsing awk

I have many data files that look like this:

,8/9/2015
Timezone,-6
,
Serial No.,19000000395CCE41
Location:,LS_trap_9u
High temperature limit (°C),20.12
Low temperature limit (°C),0.05
Date - Time,Temperature (°C)
5/28/2015 6:00,20
5/28/2015 8:00,22.6
5/28/2015 10:00,27.1
5/28/2015 12:00,26.1
5/28/2015 14:00,27.1
5/28/2015 16:00,26.1
5/28/2015 18:00,24.6
5/28/2015 20:00,23.6
5/28/2015 22:00,22.6
5/29/2015 0:00,22.1

I parse these files with this script:

awk -vFS=, -vOFS=, \
   '{gsub("\"","")}
    FNR==4{s=$2}
    FNR==5{l=$2}
    FNR>8{gsub(" ",OFS);print l,s,FILENAME,$0}' \
   *.csv > formatted_log.csv
printf "\nDone\n"

I want to extract the last character of the 'loc' string (in this case "u") and append it as another column.

The final file should look like this:

LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,u,5/28/2015,5:59,20.1
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,u,5/28/2015,7:59,27.6
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,u,5/28/2015,9:59,30.1
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,u,5/28/2015,11:59,29.6
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,u,5/28/2015,13:59,29.6
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,u,5/28/2015,15:59,28.1
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,u,5/28/2015,17:59,26.1
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,u,5/28/2015,19:59,23.6

My attempt so far looks like this:

awk -vFS=, -vOFS=, \
   '{gsub("\"","")}
    FNR==4{ser=$2}
    FNR==5{loc=$2}
    FNR>8{gsub(" ",OFS);print loc,ser,FILENAME,${loc:(-1)},$0}' \
   *.csv > formatted_log.csv

I get the following error:

awk: cmd. line:4:     FNR>8{gsub(" ",OFS);print loc,ser,FILENAME,${loc:(-1)},$0}
awk: cmd. line:4:                                                 ^ syntax error
awk: cmd. line:4:     FNR>8{gsub(" ",OFS);print loc,ser,FILENAME,${loc:(-1)},$0}
awk: cmd. line:4:                                                           ^ syntax error
awk: cmd. line:4:     FNR>8{gsub(" ",OFS);print loc,ser,FILENAME,${loc:(-1)},$0}
awk: cmd. line:4:                                                              ^ syntax error

I then changed the script to:

awk -vFS=, -vOFS=, \
   '{gsub("\"","")}
    FNR==4{ser=$2}
    FNR==5{loc=$2}
  my_loc="${loc:(-1)}"
    FNR>8{gsub(" ",OFS);print loc,ser,FILENAME,my_loc,$0}' \
   *.csv > formatted_log.CSV
printf "\nDone1\n"

This adds unwanted extra lines to the formatted_log.csv file. It looks like this:

LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,5/28/2015,5:59,20.1
5/28/2015 7:59,27.6
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,5/28/2015,7:59,27.6
5/28/2015 9:59,30.1
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,5/28/2015,9:59,30.1
5/28/2015 11:59,29.6
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,5/28/2015,11:59,29.6
5/28/2015 13:59,29.6
LS_trap_9c,3.6E+15,trap9c_3600000039654841_150809.csv,5/28/2015,13:59,29.6
5/28/2015 15:59,28.1

How can I extract the final character of a specific string in awk?

1 answer:

Answer 0 (score: 1)

To extract the last character of a string in awk, you can use:

substr(var,length(var),1)
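
As a quick check of this expression, here is a minimal sketch that needs no input files. The sample value LS_trap_9u is taken from the question's data; the shell variable name last_char is only an illustration.

```shell
# Take the last character of a string: start at position length(loc), take 1 character.
# The sample value comes from the question's data; last_char is a hypothetical name.
last_char=$(awk 'BEGIN { loc = "LS_trap_9u"; print substr(loc, length(loc), 1) }')
echo "$last_char"
```

This prints u. Note that awk string positions start at 1, so length(loc) is the index of the final character.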

The script would then be:

awk -vFS=, -vOFS=, \
   '{gsub("\"","")}
   FNR==4{ser=$2}
   FNR==5{loc=$2}
   FNR>8{gsub(" ",OFS);print loc,ser,FILENAME,substr(loc,length(loc),1),$0}' \
   *.csv > formatted_log.csv
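
To try the script without touching real data, the following sketch builds a small sample file in a temporary directory and runs the same awk program against it. The file name sample.csv and the exact line layout (serial number on line 4, location on line 5, data from line 9) are assumptions inferred from the question, not confirmed by the asker.

```shell
# Create a throwaway directory with one sample file (layout assumed from the question).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.csv" <<'EOF'
,8/9/2015
Timezone,-6
,
Serial No.,19000000395CCE41
Location:,LS_trap_9u
High temperature limit (°C),20.12
Low temperature limit (°C),0.05
Date - Time,Temperature (°C)
5/28/2015 6:00,20
5/28/2015 8:00,22.6
EOF

# Run the script from the answer; substr() supplies the last character of loc.
result=$(cd "$tmpdir" && awk -v FS=, -v OFS=, \
   '{gsub("\"","")}
   FNR==4{ser=$2}
   FNR==5{loc=$2}
   FNR>8{gsub(" ",OFS);print loc,ser,FILENAME,substr(loc,length(loc),1),$0}' \
   *.csv)
echo "$result"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

With this sample the output is:

LS_trap_9u,19000000395CCE41,sample.csv,u,5/28/2015,6:00,20
LS_trap_9u,19000000395CCE41,sample.csv,u,5/28/2015,8:00,22.6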

From man awk:

substr(s, i [, n])
       Return the at most n-character substring of s starting at i. If n is omitted, use the rest of s.
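
The effect of the optional n argument can be seen in a short sketch. The sample string again comes from the question; the variable names with_n, rest and last are purely illustrative.

```shell
# With n: at most n characters starting at position i (positions start at 1).
with_n=$(awk 'BEGIN { s = "LS_trap_9u"; print substr(s, 4, 4) }')
# Without n: everything from position i to the end of the string.
rest=$(awk 'BEGIN { s = "LS_trap_9u"; print substr(s, 4) }')
# Starting at length(s) without n therefore also yields just the last character.
last=$(awk 'BEGIN { s = "LS_trap_9u"; print substr(s, length(s)) }')
echo "$with_n $rest $last"
```

This prints trap trap_9u u, so substr(loc,length(loc)) would work in the script as well; passing 1 explicitly simply makes the intent clearer.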