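I have a large amount of data in a text file that keeps causing me problems. Many of the records contain line breaks in the middle of the record. For example, this is what my data currently looks like: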
30670169, Corvette, EL-P078675, EL-P078675, Chevrolet Corvette C6 Color Matching Millenium Yellow License Plate Frame, "Made from high-quality billet aluminum, this stylish license frame is custom painted to precisely match the color of your C6 Corvette.
It features an engraved OEM style nameplate. High-gloss finished will never rust.
12"" x 6"" in standard size. Includes color matched screw covers and hardware.
This is a special custom made item. It takes 10-15 business days to ship.
Brand new official licensed product."
This is how it should read:
30670169, Corvette, EL-P078675, EL-P078675, Chevrolet Corvette C6 Color Matching Millenium Yellow License Plate Frame, "Made from high-quality billet aluminum, this stylish license frame is custom painted to precisely match the color of your C6 Corvette. It features an engraved OEM style nameplate. High-gloss finished will never rust. 12"" x 6"" in standard size. Includes color matched screw covers and hardware. This is a special custom made item. It takes 10-15 business days to ship. Brand new official licensed product."
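I need a way to remove the line breaks, but only when they are enclosed in quotation marks. Does anyone have any ideas?
Answer 0: (score: 1)
You can open the CSV file in Excel and remove the line breaks as described at the following link: http://www.excelblog.ca/remove-line-breaks-from-excel-cell/
You can also do this on just one specific column.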
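Answer 1: (score: 1)
Try Notepad++'s Find/Replace feature.
Find:
\r(?!\n)
Replace with:
(a space)
You need to enable the regular expression option in the Replace dialog.
First try the replacement on a few lines (for example, select the first 80 lines) and replace within the selection to check the result. If it works, you can go on to process the whole file.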
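In the pattern above, \r matches CR and \n matches LF. (?!\n) is a special group (a negative lookahead) that means "do not match if \r is followed by \n". The pattern therefore replaces only a CR that stands alone and leaves every CR LF pair untouched.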
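To see what the pattern does outside Notepad++, here is a small sketch using Python's `re` module. It assumes, as the answer implicitly does, that the unwanted breaks inside the quoted field are lone CR characters while real record endings are CR LF; if the file uses the same line ending everywhere, this pattern alone will not tell the two apart. The sample string is made up for illustration.

```python
import re

# Same pattern as in the answer: a CR that is NOT followed by an LF.
LONE_CR = re.compile(r"\r(?!\n)")

# Hypothetical sample: a lone CR inside the text, CR LF at the record end.
sample = "a\rb\r\nc"

fixed = LONE_CR.sub(" ", sample)
print(repr(fixed))  # 'a b\r\nc'
```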
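Note: I think Notepad++ sometimes fails to replace correctly, so if replacing across the whole file at once causes problems, try replacing in smaller batches.
I usually use a script for this kind of thing, but if you are not used to scripting, I assume you are not ready to go that route :s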
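For anyone who does want to try the script route, here is a minimal sketch in Python. It is not the answerer's own script (none was posted), just one possible approach. It assumes that every double quote in the file either opens or closes a quoted field, and that a literal quote inside a field is written doubled (`""`), as in the sample data. The function name `join_quoted_newlines` is made up for this example.

```python
def join_quoted_newlines(text: str) -> str:
    """Replace line breaks that fall inside double-quoted fields with a space."""
    out = []
    in_quotes = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # Each quote flips the state; a doubled quote ("") flips it
            # twice, so the state is unchanged after an escaped quote.
            in_quotes = not in_quotes
            out.append(ch)
        elif ch in "\r\n" and in_quotes:
            # Treat CR LF as a single break so it yields one space, not two.
            if ch == "\r" and i + 1 < len(text) and text[i + 1] == "\n":
                i += 1
            out.append(" ")
        else:
            # Everything else, including breaks between records, is kept.
            out.append(ch)
        i += 1
    return "".join(out)


# Small made-up demo; real input would be the contents of the data file.
demo = 'id, "first line\r\nsecond line, 12"" x 6"""\r\nnext, "ok"\r\n'
print(repr(join_quoted_newlines(demo)))
```

When applying this to a real file, open it with `open(path, newline="")` for both reading and writing so that Python does not translate the line endings before the function sees them.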