Remove line breaks from a file

Date: 2017-09-25 12:03:27

Tags: perl awk sed gawk tr

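I have a comma-separated text file in which some column values contain line breaks. As a result, the column data spills onto the next line, which causes data problems.

Sample data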

"604","56-1203802","xx","VEN","null","50","1","20","N�
jTï"
"5526","841328305","yyINC","VEN","null","50","1","20","~R¿½K�ï
¿½ï¿½}("
"604","561203802","C","VEN",,"null","50","1","20","2ï½a��"

Expected output

"604","56-1203802","xx","VEN","null","50","1","20","N�jTï"
"5526","841328305","yyINC","VEN","null","50","1","20","~R¿½K���}("
"604","561203802","C","VEN",,"null","50","1","20","2ï½a��"

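I need to remove the line breaks that occur inside double-quoted strings.

I tried to remove them with the awk command below, but it does not work as expected.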

gawk -v RS='"' 'NR % 2 == 0 { gsub(/\n/, "") } { printf("%s%s", $0, RT) }' infile.txt > outfile.txt

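The desired result is to remove the LF and CR characters from the data.

I have tried the solutions posted for similar questions, but they did not work for me.

The line breaks in the file are not visible unless the text is copied into Notepad++, where they show up as CR LF.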

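2 answers:

Answer 0: (score: 0)

To answer your question about how to remove carriage returns, try the following simple command and let me know whether it helps.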

tr -d '\r' < Input_file > temp_file && mv temp_file Input_file

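Here Input_file is your file containing the input data. If this does not solve your problem, please edit the post and add sample data in code tags.

EDIT: Since the OP has changed the sample data, I am adding the following code as well.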
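To confirm that the file really contains carriage returns before deleting them, a quick check along these lines may help. It is only a sketch: the file names demo.csv and demo_clean.csv and their contents are made up for illustration.

```shell
# Build a tiny test file with CRLF line endings (hypothetical data).
printf '"1","a"\r\n"2","b"\r\n' > demo.csv

# od -c prints every byte; carriage returns show up as \r.
od -c demo.csv

# Delete every carriage return and write the result to a new file.
tr -d '\r' < demo.csv > demo_clean.csv

# Count the carriage returns that are left; this should print 0.
tr -cd '\r' < demo_clean.csv | wc -c
```

Note that tr -d '\r' removes only the CR characters. A line feed inside a quoted field is still there afterwards, so the record stays split across two lines.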

awk '{printf("%s",!/^\"/?$0:(NR==1?$0:RS $0))} END{print ""}'  Input_file

Answer 1: (score: 0)

You can try sed

sed ':loop; /" *$/!{N;s/\n//g; b loop}' file
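The sed loop treats a line as complete only when it ends in a double quote, so with CR LF line endings the carriage returns would have to be stripped first. As an alternative, here is a minimal awk sketch that handles both characters in one pass. It assumes every field is enclosed in double quotes, so a record is complete once it contains an even number of quotes. The file names demo.csv and demo_joined.csv and the sample rows are made up for illustration.

```shell
# Hypothetical sample: the second field of record 1 is split by CR LF.
printf '"1","ab\r\ncd"\r\n"2","ef"\r\n' > demo.csv

awk '
{
    sub(/\r$/, "")          # drop the CR left over from a CRLF line ending
    buf = buf $0            # append the physical line without a separator
    tmp = buf
    # gsub returns the number of replacements, i.e. the number of quotes
    if (gsub(/"/, "", tmp) % 2 == 0) {
        print buf           # quotes are balanced: the record is complete
        buf = ""
    }
}
END { if (buf != "") print buf }   # flush an unterminated last record
' demo.csv > demo_joined.csv

cat demo_joined.csv
```

Counting quotes instead of looking at the first or last character means a continuation line that happens to begin or end with a quote is still joined to the right record. A carriage return in the middle of a line, not followed by LF, is left untouched by this sketch.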