我需要帮助一个带有正则表达式的脚本来修复linux下的大文本文件(例如sed)。我的记录如下:
1373350|Doe, John|John|Doe|||B|Acme corp|...
1323350|Simpson, Homer|Homer|Simpson|||3|Moe corp|...
我需要验证第7列是否具有唯一字符(可能是字母或数字),如果为true,则添加第二列不带逗号,我的意思是:
1373350|Doe, John|John|Doe|||B Doe John|Acme corp|...
1323350|Simpson, Homer|Homer|Simpson|||3 Simpson Homer|Moe corp|...
有任何帮助吗?谢谢!
答案 0 :(得分:1)
Awk更适合这项工作:
awk -F '|' 'BEGIN { OFS = FS } length($7) == 1 { x = $2; sub(/,/, "", x); $7 = $7 " " x } 1' filename
那是:
BEGIN { OFS = FS } # output separated the same way as the input
length($7) == 1 { # if the 7th field is one character long
x = $2 # make a copy of the second field
sub(/,/, "", x) # remove comma from it
$7 = $7 " " x # append it to seventh field
}
1 # print line