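Hello, I have the following CSV input data, which contains multiple line feeds and carriage returns. I am trying to clean up the file with sed: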
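Note: <CR> and <LF> stand for the actual \r and \n characters.
I want to replace every line break that is not preceded by a double quote; the double quote character is the important part here. I managed to filter out all line breaks, but I don't know how to tell sed to ignore the ones that follow a specific pattern.
The expected output is as follows: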
"Data1","This<LF>
Is<LF>
Foobar"<CR><LF>
"Data2","Additional<LF>
Data<CR><LF>
With Inline CR LF<CR><LF>
End of Data."<CR><LF>
Any ideas?
Answer 0: (score: 1)
You can use this GNU awk command, with \r in place of <CR> and \n in place of <LF>:
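awk -v BINMODE=3 -v RS='"\r\n"' '
s != "" { printf "%s\"\n\"", s }     # print the previous record, restoring the quote, line break, quote that RS consumed
{ s = $0; gsub(/\r?\n/, " ", s) }   # flatten the line breaks left inside the record
END { print s }' file
"Data1","This Is Foobar"
"Data2","Additional Data With Inline CR LF End of Data."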
Answer 1: (score: 0)
With GNU awk, for multi-character RS and RT:
$ cat tst.awk
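BEGIN { RS="\"[^\"]*\"" }        # every quoted field acts as a record terminator, captured in RT
RT != "" {
    gsub(/\r/,"")                # drop CRs from the text between quoted fields
    gsub(/[\r\n]+/," ",RT)       # collapse line breaks inside the quoted field into one space
    printf "%s%s", $0, RT
}
END { print "" }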
$ awk -f tst.awk file
"Data1","This Is Foobar"
"Data2","Additional Data With Inline CR LF End of Data."