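Using sed to replace leading and trailing spaces in a CSV file

Time: 2019-01-02 12:49:46

Tags: sed

I am using the following command to remove leading and trailing spaces from the file A.csv: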

sed "s/^  \+//g;s/[ \t]*$//;s/ \{1,\}/ /g" < A.csv > B.csv

Here is a sample of A.csv:

"a","  v b","z"
"a","   vd","z"
"a","  v, b, c ","z  "
"a","  vb ","z   "

The problem is that not all of the leading and trailing spaces are removed, as shown below:

"a"," v b","z"
"a"," vd","z"
"a"," v, b, c ","z "
"a"," vb ","z "

Here is what I expect:

"a","v b","z"
"a","vd","z"
"a","v, b, c","z"
"a","vb","z"

How can I do this correctly?

3 answers:

Answer 0: (score: 0)

sed 's/" \+/"/g;s/[ \t]*"/"/g;s/ \{1,\}/ /g' A.csv

Output:

"a","v b","z"
"a","vd","z"
"a","v, b, c","z"
"a","vb","z"

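In your own command, only s/ \{1,\}/ /g has any effect.
The reason is that sed treats the CSV file as plain text; it does not know that the commas and quotes delimit columns.
So ^ and $ match only the beginning and end of each line, and every line here begins and ends with a double quote rather than a space.
Also, you forgot to put the g flag on the second s command, which is needed once the pattern is anchored to a quote instead of the end of the line, because it then has to match once per field.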
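The difference between anchoring to the line and anchoring to the quotes can be sketched on one row from A.csv. This is only an illustration: the function names trim_line_anchored and trim_quote_anchored are made up here, the first one is a simplified version of the anchored expressions from the question, and "  *" (a space followed by " *") is used as a portable stand-in for the GNU-specific " \+".

```shell
# Hypothetical helper names, used only for this illustration.

# Anchored to the line: ^ and $ sit next to the quote characters,
# so there are no spaces for these expressions to remove.
trim_line_anchored() {
    sed 's/^  *//;s/[[:space:]]*$//'
}

# Anchored to the quotes: removes spaces directly after or before
# every double quote, once per field thanks to the g flag.
trim_quote_anchored() {
    sed 's/"  */"/g;s/  *"/"/g'
}

line='"a","  vb ","z   "'
printf '%s\n' "$line" | trim_line_anchored    # prints the line unchanged
printf '%s\n' "$line" | trim_quote_anchored   # prints "a","vb","z"
```

Because the second function knows nothing about CSV either, it would also strip spaces that sit between a quote and a separator, which happens to be harmless for this sample.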

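Answer 1: (score: 0)

sed can't/shouldn't be used to do this properly. I suggest switching to a better language that can work with CSV files.

There is also a tool called csvtool: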

$ cat /path/to/trim
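#!/usr/bin/env bash
# csvtool call passes the fields of one row as separate arguments.
shopt -s extglob
for c; do
    # Strip leading, then trailing, whitespace from the field.
    c=${c##*([[:space:]])} c=${c%%*([[:space:]])}
    # Re-quote the field, doubling any embedded double quotes.
    printf '"%s"\n' "${c//'"'/'""'}"
done | paste -sd,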

$ csvtool call /path/to/trim A.csv
"a","v b","z"
"a","vd","z"
"a","v, b, c","z"
"a","vb","z"

As much as I like csvtool for simple things, this is unfortunately painfully slow! My VBox took almost 15 seconds to process a short 4000-line CSV.

Answer 2: (score: 0)

This might work for you (GNU sed):

sed -r 's/"\s*([^[:space:]"]+(\s*[^[:space:]"]+)*)\s*"/"\1"/g' file

This removes the whitespace immediately inside each pair of double quotes, globally throughout the file.
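To see the expression at work on a single row, here is a small sketch. It swaps the GNU-specific -r and \s for -E and [[:space:]], which is an assumption made here for portability and not part of the answer; trim_fields is a hypothetical name.

```shell
# Hypothetical wrapper around the expression above, written with
# POSIX character classes instead of \s (an assumption, see lead-in).
trim_fields() {
    sed -E 's/"[[:space:]]*([^[:space:]"]+([[:space:]]*[^[:space:]"]+)*)[[:space:]]*"/"\1"/g'
}

# Group 1 starts and ends on a non-space character, so only the
# whitespace next to the quotes is dropped; inner spaces are kept.
printf '%s\n' '"a","  v, b, c ","z  "' | trim_fields   # prints "a","v, b, c","z"
```

Note that the pattern needs at least one non-space character between the quotes, so an empty or all-blank field is not matched by it.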