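I'm not good at Unix.
I have a CSV file with multiple columns. One of the columns contains newlines and ^M characters.
I need to replace every one of them that falls between two " characters (that is, inside a single cell value) with ~~, so that I can treat the cell value as a single field. Here is a sample file: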
"id","notes"
"N001","this is^M
test.
Again test
"
"N002","this is perfect"
"N00345","this is
having ^M
problem"
I need the file to look like this:
"id","notes"
"N001","this is~~test.~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem"
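That way the whole cell value can be read as a single field value.
I need to add one more case to this requirement: the data in a cell may itself contain a " (double quote). In that case, we can identify the closing " as the one that is followed by a comma. Here is the updated sample data: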
"id","notes"
"N001","this is^M
test. "Again test."
Again test
"
"N002","this is perfect"
"N00345","this is
having ^M
problem as it contains "
test"
We can either keep the embedded " or remove it. The expected output is:
"id","notes"
"N001","this is~~test. "Again test."~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is ~~~~having ~~problem as it contains "~~test"
Answer 0 (score: 3)
Try using sed:
sed -i -e 's/^M//g' -e '/"$/!{:a N; s/\n/~~/; /"$/b; ba}' file
Note: to enter ^M, type Ctrl+V and then Ctrl+M.
File contents after running the command:
"id","notes"
"N001","this is~~test.~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem"
Or use dos2unix followed by sed:
dos2unix file
sed -i '/"$/!{:a N; s/\n/~~/; /"$/b; ba}' file
Brief explanation
The idea here is to remove the newline from every line that does not end with ".
sed -i ' # -i specifies in-place replace, i.e. modifies the file itself
/"$/!{ # if a line doesn't contain end pattern, " at the end of a line, then do following
:a # label 'a' for branching/looping
N; # append the next line of input into the pattern space
s/\n/~~/; # replace newline character '\n' with '~~' i.e. suppress new lines
/"$/b; # if a line contains end pattern then branch out i.e. break the loop
ba # branch to label 'a' i.e. this will create loop around label 'a'
}
' file # input file name
See man sed for details.
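The loop explained above can also be tried without typing a literal ^M and without editing the file in place. The following is only a sketch under stated assumptions: `join_cells` is a hypothetical helper name, every record is assumed to end with a " at the end of a line, and each CR is replaced with ~~ (as the question asks) rather than deleted.

```shell
#!/bin/sh
# join_cells is a hypothetical helper name, not part of the answer above.
# Assumption: every record ends with a double quote at the end of a line.
join_cells() {
    cr=$(printf '\r')               # a real carriage return, shown as ^M
    # Pass 1: replace every CR with ~~ (each line is handled separately).
    # Pass 2: while the pattern space does not end with ", append the next
    #         line and turn the joining newline into ~~.
    sed "s/${cr}/~~/g" "$1" |
    sed -e ':a' -e '/"$/!{' -e '$!{' -e 'N' -e 's/\n/~~/' -e 'ba' -e '}' -e '}'
}

# Demo on a two-record sample built with printf so the CR is unambiguous.
sample=$(mktemp)
printf '"N001","this is\r\ntest.\n"\n"N002","this is perfect"\n' > "$sample"
join_cells "$sample"
rm -f "$sample"
```

Each CR and each newline inside a cell becomes one ~~, so a CR followed by a newline shows up as ~~~~. The result goes to standard output; redirect it to a new file if the original should be kept.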
Edit
Sometimes the data in a cell contains a " inside it.
Using sed:
sed -i ':a N; s/\n/~~/; $s/"~~"/"\n"/g; ba' file
File contents after running the command on the updated sample data:
"id","notes"
"N001","this is~~test. "Again test."~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem as it contains "~~test"
Using a perl one-liner:
perl -0777 -i -pe 's/\n/~~/g; s/"~~("|$)/"\n$1/g;' file
Answer 1 (score: 2)
You can use the sed command.
To replace '^M' alone:
sed -i 's|^M|~~|g' file_name
Edit: thanks for the comment.
Adding a statement to replace '^M followed by a newline':
sed -i ':a;N;$!ba;s|^M\n|~~|g' file_name
To type '^M' in the console, press Ctrl+V and then Ctrl+M.
Answer 2 (score: 0)
Use tr.
$ tr '<Ctrl>+m' '~'
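One thing to keep in mind with tr is that it maps single characters to single characters, so it can turn a CR into one ~ or delete it, but it cannot produce the two-character ~~ on its own. A small sketch, using the `\r` escape instead of a typed ^M:

```shell
#!/bin/sh
# tr maps one character to one character, so a CR can become "~" but not "~~".
printf 'this is\r\ntest\n' | tr '\r' '~'    # CR replaced by a single ~
printf 'this is\r\ntest\n' | tr -d '\r'     # CR deleted instead
```

Newlines are left alone in both commands, so the lines of a cell are still not joined.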
Answer 3 (score: 0)
sed 's/\^M/~~/;t nextline
b
: nextline
N
s/\n/~~/
s/^[^"]*\("[^"]*"\)\{1,\}[^"]*$/&/
t
b nextline
' file
This changes not only ^M but also the newlines between the quotes.
In a Unix session, ^M is entered with Ctrl+V followed by Ctrl+M on the keyboard.