Replacing CRLFCRLF with CRLF in a large file using Python

Time: 2018-09-10 14:53:44

Tags: python-2.7 file replace carriage-return linefeed

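I have a large txt file that happens to use the combination CRLFCRLF as its line ending. I have to change this to CRLF before I can work with the file. Find-and-replace in a text editor and text editor macros take too long, because the file is 8 GB. How can I do this with Python 2.7? I tried the following, but it does not change the file. When I try it with ordinary typed strings, for example replace('a','A') or replace('BUS','CAR'), it works fine: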

f1 = open('C:/temp/Textfile1.txt', 'r')
f2 = open('C:/temp/Textfile2.txt', 'w')
string = f1.read()
string = string.replace('\r\n\r\n','\r\n')
f2.write(string)
f1.close()
f2.close()

1 Answer:

Answer 0: (score: 0)

Try it with a regular expression:

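import re

fn = "t.txt"
fn2 = "r.txt"

print('-' * 70)
# Binary mode ("wb"/"rb") keeps "\r\n" intact; text mode on Windows would translate it.
with open(fn, "wb") as f:
    f.write("ta\r\ntata\r\n\r\ntata\r\n\r\n\r\nta\r\ntaa\r\n\r\n\r\n\r\ntata")

with open(fn, "rb") as f:
    print(f.read())

with open(fn, "rb") as f:
    t = f.read()

# Replace each CRLFCRLF pair with a single CRLF (non-overlapping, left to right).
subbed = re.sub(r"\r\n\r\n", r"\r\n", t)
with open(fn2, "wb") as f:
    f.write(subbed)

print('-' * 70)
with open(fn2, "rb") as f:
    print(f.read())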

Output:

----------------------------------------------------------------------
ta
tata

tata


ta
taa



tata
----------------------------------------------------------------------
ta
tata
tata

ta
taa

tata
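
The blank lines that remain in the second block come from the substitution being non-overlapping: three consecutive CRLFs become two, and four become two. The sketch below illustrates this on byte strings. The helper names collapse_once and collapse_runs are made up for this illustration, and collapse_runs is only an assumption about what one might want if every run of CRLFs should shrink to a single one, which is not what the question asks for.

```python
import re


def collapse_once(data):
    # Same substitution as in the answer: each CRLFCRLF pair becomes one CRLF.
    # Matches are taken left to right and never overlap.
    return re.sub(b"\r\n\r\n", b"\r\n", data)


def collapse_runs(data):
    # Hypothetical variant: any run of two or more CRLFs becomes one CRLF.
    return re.sub(b"(?:\r\n){2,}", b"\r\n", data)


three = b"a\r\n\r\n\r\nb"
four = b"a\r\n\r\n\r\n\r\nb"

print(repr(collapse_once(three)))  # two CRLFs remain between a and b
print(repr(collapse_once(four)))   # two CRLFs remain between a and b
print(repr(collapse_runs(four)))   # one CRLF remains between a and b
```

If the file really uses CRLFCRLF as its only line ending, a single pass of collapse_once should be enough.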

Side note:

If the file uses Linux-style LF line endings instead, use subbed = re.sub(r"\n\n", r"\n", t)
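
For an 8 GB file, reading everything with f.read() may not fit in memory. One possible approach, sketched below under a few assumptions, is to stream the file in binary mode and process it chunk by chunk. The function names collapse_chunk and collapse_file and the default chunk size are made up for this sketch and are not part of the original answer. Binary mode matters here because text mode on Windows translates "\r\n" to "\n" while reading, so a search for "\r\n\r\n" would never match. The only subtle part is a CRLFCRLF that straddles two chunks, so up to three trailing bytes that could start such a pair are held back and joined with the next chunk.

```python
import io

DOUBLE = b"\r\n\r\n"
SINGLE = b"\r\n"


def collapse_chunk(data, final):
    """Replace CRLFCRLF in data; return (bytes to write, bytes to hold back)."""
    pieces = []
    pos = 0
    while True:
        hit = data.find(DOUBLE, pos)
        if hit < 0:
            break
        pieces.append(data[pos:hit])
        pieces.append(SINGLE)
        pos = hit + len(DOUBLE)
    rest = data[pos:]
    keep = 0
    if not final:
        # Hold back the longest tail that could be the start of a CRLFCRLF
        # continuing in the next chunk (at most 3 bytes).
        for size in range(min(len(DOUBLE) - 1, len(rest)), 0, -1):
            if rest.endswith(DOUBLE[:size]):
                keep = size
                break
    pieces.append(rest[:len(rest) - keep])
    return b"".join(pieces), rest[len(rest) - keep:]


def collapse_file(src_path, dst_path, chunk_size=16 * 1024 * 1024):
    """Copy src_path to dst_path, turning each CRLFCRLF into one CRLF."""
    pending = b""
    with io.open(src_path, "rb") as src, io.open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            output, pending = collapse_chunk(pending + chunk, final=False)
            dst.write(output)
        # Flush whatever was still held back at end of file.
        output, _ = collapse_chunk(pending, final=True)
        dst.write(output)
```

With the paths from the question, a call would look like collapse_file('C:/temp/Textfile1.txt', 'C:/temp/Textfile2.txt'). The result should match what a single replace over the whole file would produce, while memory use stays around one chunk.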