Question

I have the following text:

Matt has 11 eggs and they are brown
Helen has 23 ducks and they are black and brown
Todd has 34 quarters and they are silver
Bud has 45 pens and they are red, yellow, "greenish" and blue

When I run the following sed command:

sed -E "s/([^ ]+) has ([^ ]+) ([^ ]+) and they are (.*)/\"\1\",\"\2\",\"\3\",\"\4\"/" input

I get this CSV:

"Matt","11","eggs","brown"
"Helen","23","ducks","black and brown"
"Todd","34","quarters","silver"
"Bud","45","pens","red, yellow, "greenish" and blue"

But what I actually want is this (with the embedded quotes properly escaped):

"Matt","11","eggs","brown"
"Helen","23","ducks","black and brown"
"Todd","34","quarters","silver"
"Bud","45","pens","red, yellow, \"greenish\" and blue"

How can I do this?

Answer 1

Try:

sed -E 's/"/\\"/g; 
  s/([^ ]+) has ([^ ]+) ([^ ]+) and they are (.*)/"\1","\2","\3","\4"/' input

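This first replaces every instance of " with \", then runs your original substitution. Note how wrapping the sed program in single quotes makes it more readable, because the double quotes inside the program no longer need to be escaped for the shell.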
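
To check this, here is a minimal sketch, assuming a POSIX shell and a sed that accepts -E (GNU and BSD sed both do). The helper name to_csv is hypothetical and only there for illustration; the sample line comes from the question.

```shell
#!/bin/sh
# to_csv is a hypothetical helper name, not part of the original answer.
# It reads lines on stdin and writes CSV rows on stdout.
to_csv() {
  # 1st command: escape every embedded double quote (" becomes \").
  # 2nd command: the original substitution, wrapping each captured field in quotes.
  sed -E 's/"/\\"/g
s/([^ ]+) has ([^ ]+) ([^ ]+) and they are (.*)/"\1","\2","\3","\4"/'
}

printf '%s\n' 'Bud has 45 pens and they are red, yellow, "greenish" and blue' | to_csv
```

This prints "Bud","45","pens","red, yellow, \"greenish\" and blue". The order matters: the escaping has to happen before the field-wrapping quotes are added, otherwise those would be escaped too.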

Answer 2

This might work for you (GNU sed):

sed -r 's/"/\\&/g;s/^\\"|\\(",)\\"|\\"$/\1"/g'  file

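This converts every " to \", then removes the \ again from the quote at the start of the line, the quote at the end of the line, and the quotes on either side of each separating comma. Judging by the pattern, file here appears to be the CSV already produced by the original command, not the raw text.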
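
A minimal sketch of that two-pass idea, assuming GNU sed (where -E is equivalent to -r) and assuming the input is the incorrectly quoted CSV from the question. The helper name fix_quotes is hypothetical.

```shell
#!/bin/sh
# fix_quotes is a hypothetical helper name, not part of the original answer.
# It expects CSV rows whose embedded quotes are not yet escaped.
fix_quotes() {
  # 1st command: put a backslash in front of every double quote (& is the match).
  # 2nd command: drop the backslash again from the structural quotes only:
  #   ^\"     the quote opening the first field
  #   \",\"   the quotes around each field separator (\1 keeps the ", part)
  #   \"$     the quote closing the last field
  sed -E 's/"/\\&/g;s/^\\"|\\(",)\\"|\\"$/\1"/g'
}

printf '%s\n' '"Bud","45","pens","red, yellow, "greenish" and blue"' | fix_quotes
```

This prints "Bud","45","pens","red, yellow, \"greenish\" and blue". One caveat: a field whose text itself contains the sequence "," would be treated as a field separator, so this approach only suits data where that cannot occur.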

Escaping double quotes in a sed backreference replacement
