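How to replace newlines and ^M characters with ~~ in Unix

Time: 2013-11-22 10:02:51

Tags: file unix replace sed newline

I'm not good at Unix.

I have a CSV file with multiple columns. One of the columns contains newlines and ^M characters. I need to replace each of them with ~~ wherever they occur between two " (that is, inside a single cell value), so that the cell value can be treated as a single field. Here is a sample file: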

"id","notes"
"N001","this is^M
test.

Again test

"
"N002","this is perfect"
"N00345","this is

having ^M
problem"

I need the file to look like this:

"id","notes"
"N001","this is~~test.~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem"

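That way the whole cell value can be read as a single field value.

I need to add one more case to this requirement: the data in a cell may itself contain " (a double quote). In that case the closing " can be recognized because it is followed by a comma. Here is the updated sample data: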

"id","notes"
"N001","this is^M
test. "Again test."

Again test

"
"N002","this is perfect"
"N00345","this is

having ^M
problem as it contains "
test"

We can either keep the " or remove it. The expected output is:

"id","notes"
"N001","this is~~test. "Again test."~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is ~~~~having ~~problem as it contains "~~test"

4 answers:

Answer 0: (score: 3)

Try using sed

sed -i -e 's/^M//g' -e '/"$/!{:a N; s/\n/~~/; /"$/b; ba}' file

Note: to enter ^M, type Ctrl+V and then Ctrl+M.

File contents after running the command

"id","notes"
"N001","this is~~test.~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem"

Using dos2unix followed by sed

dos2unix file
sed -i '/"$/!{:a N; s/\n/~~/; /"$/b; ba}' file

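Short explanation

The idea here is to remove the newline from every line that does not end with ", so that each such line is joined to the lines that follow it:

sed -i '              # -i specifies in-place replace, i.e. it modifies the file itself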
  /"$/!{              # if a line doesn't contain end pattern, " at the end of a line, then do following
    :a                # label 'a' for branching/looping
      N;              # append the next line of input into the pattern space 
      s/\n/~~/;       # replace newline character '\n' with '~~' i.e. suppress new lines
      /"$/b;          # if a line contains end pattern then branch out i.e. break the loop
      ba              # branch to label 'a' i.e. this will create loop around label 'a'
  }                   
' file                # input file name

See man sed for details.
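The loop explained above can be tried out without modifying the original file by dropping -i and reading from standard input. The following is only a sketch under some assumptions: the function name join_cell_lines is hypothetical (it is not part of the answer), carriage returns are deleted with tr instead of a typed ^M, and every record is assumed to end with a " at the end of a line, as in the first sample.

```shell
# join_cell_lines: hypothetical helper name, not part of the original answer.
# Reads CSV text on stdin and writes the joined records to stdout.
join_cell_lines() {
  # Step 1: tr deletes every carriage return (^M).
  # Step 2: for a line that does not end with ", sed keeps appending the
  #         next line (N) and turns the embedded newline into ~~ until the
  #         collected text ends with ".
  tr -d '\r' | sed '/"$/!{
:a
N
s/\n/~~/
/"$/b
ba
}'
}

# Example: the N00345 record from the question, rebuilt with printf
printf '"N00345","this is\n\nhaving \r\nproblem"\n' | join_cell_lines
```

Each command sits on its own line inside the braces, which avoids relying on how a particular sed treats ; after a label. Redirect the output to a new file once the result looks right.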


Edit

Sometimes the data in a cell contains " within it.

Using sed

sed -i ':a N; s/\n/~~/; $s/"~~"/"\n"/g; ba' file

File contents after running the command on the updated sample data

"id","notes"
"N001","this is~~test. "Again test."~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem as it contains "~~test"

Using a perl one-liner

perl -0777 -i -pe 's/\n/~~/g; s/"~~("|$)/"\n$1/g;' file

Answer 1: (score: 2)

You can do this with the sed command.

To replace '^M' alone:

sed -i 's|^M|~~|g' file_name

Edit: Thanks for the comment.

Added a command that replaces '^M' together with the newline that follows it:

sed -i ':a;N;$!ba;s|^M\n|~~|g' file_name

To get '^M' on the console, press Ctrl+V and then Ctrl+M.
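If typing the control character is awkward, for example inside a script file, the carriage return can be kept in a shell variable instead. A minimal sketch, assuming a POSIX shell; the variable name CR and the function name crlf_to_tildes are hypothetical:

```shell
# Hold a literal carriage return in a variable instead of typing Ctrl+V Ctrl+M.
# Command substitution only strips trailing newlines, so the CR is preserved.
CR=$(printf '\r')

# crlf_to_tildes: hypothetical helper name, not part of the original answer.
crlf_to_tildes() {
  # :a / N / $!ba collect the whole input in the pattern space;
  # the final s command then replaces every CR+newline pair with ~~.
  sed -e ':a' -e 'N' -e '$!ba' -e "s/${CR}\n/~~/g"
}

# Example: one CR+newline pair inside the text
printf 'this is\r\ntest\n' | crlf_to_tildes
```

As with the command above, a newline that is not preceded by ^M is left untouched, so this alone does not join cells that were split by plain newlines.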

Answer 2: (score: 0)

Using tr

$ tr '\r' '~' < file

Answer 3: (score: 0)

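sed ':nextline
# drop a trailing ^M from the text collected so far
s/^M$//
# an even number of double quotes means the record is complete: print it
/^\([^"]*"[^"]*"\)*[^"]*$/b
# otherwise append the next line and turn the embedded newline into ~~
N
s/\n/~~/
b nextline
' file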

This changes not only ^M but also the newlines between the quotes.

^M is entered in a Unix session with Ctrl+V followed by Ctrl+M on the keyboard.