I have lines in a text file like the following example:
"2009217",2015,3,"N","N","2","UPPER DARBY FIREFIGHTERS "PAC"","","","","7235 WEST CHESTER PIKE","","UPPER DARBY","PA","19082","","6106220269",4245.0100,650.0000,.0000
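Throughout the whole file, I want to remove the inner double quotes from multi-part strings like this one: "UPPER DARBY FIREFIGHTERS "PAC""
So for each instance of such repeated double quotes, the result should look like this: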
"2009217",2015,3,"N","N","2","UPPER DARBY FIREFIGHTERS PAC","","","","7235 WEST CHESTER PIKE","","UPPER DARBY","PA","19082","","6106220269",4245.0100,650.0000,.0000
I got as far as this sed line:
cat file.txt | sed "s/\([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,\)\([^,]*\),\(.*\)/\1\2\3/"
But now I don't know how to replace the double quotes inside \2.
Is this possible with sed?
Answer 0: (score: 2)
Personally I would use awk, because it is more readable:
#!/usr/bin/awk -f
BEGIN {
    # Use ',' as the input and output field delimiter
    FS = OFS = ","
}
{
    # Iterate through all fields. (NF is the number of fields.)
    for (i = 1; i <= NF; i++) {
        # If the field starts and ends with a '"'
        if ($i ~ /^".*"$/) {
            # Remove all '"'
            gsub(/"/, "", $i)
            # Wrap in '"' again
            $i = "\"" $i "\""
        }
    }
    # Print the (possibly modified) record
    print
}
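To try the awk approach without saving a script file, the same logic can be run inline against the record from the question. This is only a minimal sketch: the variable names `line` and `fixed` are made up for the example, and it assumes that no quoted field contains a comma, because splitting on a bare `,` would break such a field into pieces.

```shell
#!/bin/sh
# Sample record from the question, with stray quotes around PAC.
line='"2009217",2015,3,"N","N","2","UPPER DARBY FIREFIGHTERS "PAC"","","","","7235 WEST CHESTER PIKE","","UPPER DARBY","PA","19082","","6106220269",4245.0100,650.0000,.0000'

# Same logic as the awk script, written inline (assumes no commas inside quoted fields).
fixed=$(printf '%s\n' "$line" | awk '
BEGIN { FS = OFS = "," }
{
    for (i = 1; i <= NF; i++) {
        # Only touch fields that start and end with a double quote
        if ($i ~ /^".*"$/) {
            gsub(/"/, "", $i)        # drop every quote in the field
            $i = "\"" $i "\""        # put the outer pair back
        }
    }
    print
}')

printf '%s\n' "$fixed"
```

Unquoted numeric fields such as 4245.0100 do not match the test and pass through unchanged.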
Answer 1: (score: 2)
This might work for you (GNU sed):
sed -r ':a;s/^((([^",]*,)*("[^",]*",([^",]*,)*)*)"[^",]*)"([^,])/\1\6/;ta' file
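This removes superfluous double quotes from strings that are enclosed in double quotes and delimited by ,.
It works by skipping over properly constructed double-quoted strings and unquoted strings (numbers, in this example), then removing a double quote that is not followed by a ,. Each substitution removes one stray quote, and the ta branch repeats the substitution until no more matches are found.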
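The loop can be watched on a small input. The sketch below wraps the command in a shell function, `fix_quotes` (a name invented for this example), and feeds it a made-up short record as well as the record from the question. It assumes GNU sed, as the answer does.

```shell
#!/bin/sh
# Requires GNU sed (for -r and the one-line ':a;...;ta' loop syntax).
# fix_quotes is a hypothetical helper name used only in this example.
fix_quotes() {
    sed -r ':a;s/^((([^",]*,)*("[^",]*",([^",]*,)*)*)"[^",]*)"([^,])/\1\6/;ta'
}

# Made-up record with stray quotes in two different fields.
short=$(printf '%s\n' '"A "B" C",1,"D "E""' | fix_quotes)
printf '%s\n' "$short"

# The record from the question.
full=$(printf '%s\n' '"2009217",2015,3,"N","N","2","UPPER DARBY FIREFIGHTERS "PAC"","","","","7235 WEST CHESTER PIKE","","UPPER DARBY","PA","19082","","6106220269",4245.0100,650.0000,.0000' | fix_quotes)
printf '%s\n' "$full"
```

For the short record, four passes are needed (one per stray quote), and the result should be "A B C",1,"D E". Unlike the awk version, this never splits on commas itself, but the pattern still assumes that quoted fields contain no commas.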
[^",]*,                             # non double quoted strings
"[^",]*",                           # properly quoted strings
(([^",]*,)*("[^",]*",([^",]*,)*)*)  # eliminate all properly constructed strings
"[^",]*"([^,])                      # improper double quotes
       ^
       |