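I have a delimited file in which I'm trying to replace commas with pipes (|), except when the comma (and other text) is between quotes (").
I know I can replace commas with sed 's/,/|/g' filename, but I'm not sure how to make the text between quotes an exception to the rule, or whether that is even possible or easy.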
Answer 0: (score: 0)
As others here have recommended, the best and safest way is to read the CSV as CSV with a proper module/library.
Anyway, if you want sed, here it is:
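As a minimal sketch of the library route (the answer does not name a library, so Python's standard `csv` module is an assumption here, and `commas_to_pipes` is a hypothetical helper name), the file can be parsed as CSV and written back with `|` as the delimiter:

```python
import csv
import io

def commas_to_pipes(text: str) -> str:
    """Parse comma-delimited text and re-emit it with '|' as the delimiter."""
    # skipinitialspace=True lets a quote that follows ", " still open a quoted field.
    reader = csv.reader(io.StringIO(text, newline=""), skipinitialspace=True)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

sample = 'aaa,1,"what\'s up"\nccc,3,"here is comma, in text"\n'
print(commas_to_pipes(sample), end="")
```

Unlike the sed version, the writer quotes a field only when it needs to (for example, when the field itself contains a `|`), so the quoting in the output may differ from the input.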
sed -i 's/|//g;y/,/|/;:r;s/\("[^"]*\)|\([^"]*"\)/\1,\2/g;tr' file.csv
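Procedure: remove any existing pipes, turn every comma into a pipe, then loop, turning each pipe that sits between a pair of quotes back into a comma.
Test: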
$ cat file.csv
aaa,1,"what's up"
bbb,2,"this is pipe | in text"
ccc,3,"here is comma, in text"
ddd,4, ",, here a,r,e multi, commas,, ,,"
"e,e",5,first column
$ cat file.csv | sed 's/|//g;y/,/|/;:r;s/\("[^"]*\)|\([^"]*"\)/\1,\2/g;tr'
aaa|1|"what's up"
bbb|2|"this is pipe  in text"
ccc|3|"here is comma, in text"
ddd|4| ",, here a,r,e multi, commas,, ,,"
"e,e"|5|first column
$ cat file.csv | sed 's/|//g;y/,/|/;:r;s/\("[^"]*\)|\([^"]*"\)/\1,\2/g;tr' | awk -F'|' '{ print NF }'
3
3
3
3
3
Answer 1: (score: 0)
You can try this sed:
sed ':A;s/^\(\([^"]*"[^"]*"\)*[^",]*\),/\1|/;tA' infile
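The underlying idea, replacing only the commas that sit outside quotes, can also be sketched as a simple character scan. This is an illustration in Python rather than a translation of the sed command; `replace_unquoted_commas` is a hypothetical name, and the sketch assumes quotes are always balanced and never escaped.

```python
def replace_unquoted_commas(line: str, new_sep: str = "|") -> str:
    """Replace commas with new_sep, leaving commas inside double quotes alone."""
    out = []
    in_quotes = False
    for ch in line:
        if ch == '"':
            # Every double quote toggles the "inside quotes" state.
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "," and not in_quotes:
            out.append(new_sep)
        else:
            out.append(ch)
    return "".join(out)

for line in ['aaa,1,"what\'s up"', '"e,e",5,first column']:
    print(replace_unquoted_commas(line))
```

Because the scan never touches anything inside quotes, an existing pipe in quoted text is kept as-is.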
Answer 2: (score: 0)
Using GNU awk's FPAT and @Kubator's sample file:
$ awk '
BEGIN {
FPAT="([^,]+)|( *\"[^\"]+\" *)" # define the field pattern, notice the space before "
OFS="|" # output field separator
}
{
$1=$1 # rebuild the record
}1' file # output
aaa|1|"what's up"
bbb|2|"this is pipe | in text"
ccc|3|"here is comma, in text"
ddd|4| ",, here a,r,e multi, commas,, ,,"
"e,e"|5|first column