Input file:
"1","2col",""3col " "
"2","2col"," "3c,ol " "
"3","2col"," 3co,l"
"4","2col","3co,l"
"5","2col",""3co,l "" "
"6","2col",""3c,ol ""3c,ol"""
Output file:
"1","2col","3col "
"2","2col"," 3c,ol "
"3","2col"," 3co,l"
"4","2col","3co,l"
"5","2col","3co,l "
"6","2col","3c,ol 3c,ol"
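Please help me get the above output using Unix commands. Note that only the third column is modified in the output: all of its inner double quotes have been removed.
The comma is the field terminator. A comma that appears between double quotes is not treated as a terminator. See line 6: after the second comma, the commas appear as text between the double quotes, which is fine.
So far I have tried: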
sed 's/""|/|/g'
sed -e "s/\"\"//g"
perl -pe 's/(?<!^)(?<!\,)"(?!\,)(?!$)/""/g'
Answer 0 (score: 1)
Assuming the first and second columns are "clean" (for example, they do not contain a ,).
Input:
"1","2col",""3col " "
"2","2col"," "3c,ol " "
"3","2col"," 3co,l"
"4","2col","3co,l"
"5","2col",""3co,l "" "
"6","2col",""3c,ol ""3c,ol"""
Command:
tr -d '"' < input | awk -F',' -v OFS=',' '{$1="\""$1"\"";$2="\""$2"\"";printf $1 OFS $2 OFS "\"";for(u=3;u<=NF;u++){if(u!=NF)printf $u OFS;else printf $u};printf "\"" RS}'
Output:
"1","2col","3col "
"2","2col"," 3c,ol "
"3","2col"," 3co,l"
"4","2col","3co,l"
"5","2col","3co,l "
"6","2col","3c,ol 3c,ol"
Explanation:
tr -d '"' < input
removes every " from the input.
| awk
pipes that output to awk.
-F',' -v OFS=','
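defines both the input and the output field separator as a comma.
$1="\""$1"\"";$2="\""$2"\"";
surrounds the first two columns with ", and
printf $1 OFS $2 OFS "\"";
prints them, followed by the opening " of the third column.
for(u=3;u<=NF;u++){if(u!=NF)printf $u OFS;else printf $u};printf "\"" RS}
rejoins the remaining fields with commas and then appends the final " at the end of the line. Because printf receives the data as its format string here, a literal % in the data would be misinterpreted; printf "%s", $u avoids that.
For readability: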
'{
$1="\""$1"\""
$2="\""$2"\""
printf $1 OFS $2 OFS "\""
for(u=3;u<=NF;u++)
{
if(u!=NF)printf $u OFS
else printf $u
}
printf "\"" RS
}'
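The steps above can also be written with an explicit format string, so that a % in the data is printed literally. This is a minimal sketch rather than the answerer's exact command: the function name clean_third_column is hypothetical, and it keeps the same assumption that columns 1 and 2 contain no commas.

```shell
# Hypothetical helper: reads CSV lines on stdin, writes cleaned lines to stdout.
# Assumes columns 1 and 2 contain no commas.
clean_third_column() {
  tr -d '"' | awk -F',' '{
    # Re-quote the first two fields and open the quote of the third column.
    printf "\"%s\",\"%s\",\"", $1, $2
    # Rejoin everything from field 3 onward, restoring the commas.
    for (u = 3; u <= NF; u++)
      printf "%s%s", $u, (u < NF ? "," : "")
    # Close the quote and end the line.
    printf "\"\n"
  }'
}

printf '%s\n' '"6","2col",""3c,ol ""3c,ol"""' | clean_third_column
```

The data is always passed as an argument to "%s" and never as the format itself, which is the only behavioral difference from the one-liner.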
Answer 1 (score: 0)
Split on the double quote to pick out the first two fields, then concatenate the remaining fields.
awk -F '"' '
BEGIN {q="\""}
{printf "%s", q$2q$3q$4q$5q; for (i=6;i<=NF;i++) printf "%s", $i; print q}
' inputfile
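To see why the command starts at $2 and why the loop begins at field 6, it can help to print the fields awk produces when the separator is a double quote. The helper name show_quote_fields below is hypothetical and exists only for illustration.

```shell
# Hypothetical helper: list every field awk sees when splitting on a double quote.
show_quote_fields() {
  awk -F '"' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

printf '%s\n' '"4","2col","3co,l"' | show_quote_fields
```

For this line, field 1 is empty (the text before the first quote), fields 2 and 4 hold the first two columns, fields 3 and 5 hold the commas between them, and the third column starts at field 6.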
Edit: an alternative approach
paste -d, <( cut -d"," -f1,2 < inputfile) \
<( cut -d"," -f3- < inputfile | sed 's/"//g;s/.*/"&"/')
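The <( ... ) syntax is process substitution, which is available in bash, ksh and zsh but not in plain POSIX sh. A sketch of the same cut and paste idea using temporary files instead is shown here; the function name join_cleaned is hypothetical, and mktemp is assumed to be installed (it is common but not required by POSIX).

```shell
# Hypothetical helper: join_cleaned FILE
# Same idea as the paste/cut command, without process substitution.
join_cleaned() {
  tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
  # Columns 1 and 2 are kept exactly as they are.
  cut -d, -f1,2 < "$1" > "$tmp1"
  # Column 3 onward: drop every double quote, then wrap the line in one pair.
  cut -d, -f3- < "$1" | sed 's/"//g;s/.*/"&"/' > "$tmp2"
  # Glue the two halves back together with a comma.
  paste -d, "$tmp1" "$tmp2"
  rm -f "$tmp1" "$tmp2"
}

# Usage: join_cleaned inputfile
```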
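Edit: another alternative
sed 's/old/new/g': applies the replacement to all matches of the regexp.
sed 's/old/new/number': replaces only the number-th match of the regexp.
When the g and number modifiers are combined in GNU sed, the matches before the number-th one are ignored, and all matches from the number-th one onward are replaced.
In this case: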
sed -r 's/"//g6;s/$/"/' inputfile
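A small sketch of that behavior on a simpler string, followed by the same command wrapped in a function. It assumes GNU sed, since POSIX does not define what a number combined with g should do; the function name strip_inner_quotes is hypothetical.

```shell
# Requires GNU sed: a number combined with g is not specified by POSIX.
# Replace the third dash and every dash after it.
printf '%s\n' 'a-b-c-d-e' | sed 's/-/+/3g'
# GNU sed prints: a-b-c+d+e

# Hypothetical helper: the first five quotes delimit columns 1 and 2 and open
# column 3, so delete the sixth quote onward and append one closing quote.
strip_inner_quotes() {
  sed 's/"//g6;s/$/"/'
}

printf '%s\n' '"6","2col",""3c,ol ""3c,ol"""' | strip_inner_quotes
```

Like the other answers, this relies on columns 1 and 2 being properly quoted with no inner quotes, so that the sixth quote always falls inside the third column.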