正则表达式替换字符if是唯一的

时间:2015-05-15 12:38:28

标签: regex sed

我需要帮助一个带有正则表达式的脚本来修复linux下的大文本文件(例如sed)。我的记录如下:

1373350|Doe, John|John|Doe|||B|Acme corp|...
1323350|Simpson, Homer|Homer|Simpson|||3|Moe corp|...

我需要验证第7列是否具有唯一字符(可能是字母或数字),如果为true,则添加第二列不带逗号,我的意思是:

1373350|Doe, John|John|Doe|||B Doe John|Acme corp|...
1323350|Simpson, Homer|Homer|Simpson|||3 Simpson Homer|Moe corp|...

有任何帮助吗?谢谢!

1 个答案:

答案 0 :(得分:1)

Awk更适合这项工作:

awk -F '|' 'BEGIN { OFS = FS } length($7) == 1 { x = $2; sub(/,/, "", x); $7 = $7 " " x } 1' filename

那是:

BEGIN { OFS = FS }   # output separated the same way as the input
length($7) == 1 {    # if the 7th field is one character long
  x = $2             # make a copy of the second field
  sub(/,/, "", x)    # remove comma from it
  $7 = $7 " " x      # append it to seventh field
}
1                    # print line