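I'm new to programming. I'm trying to parse and reformat 'broken' lines in a txt file (rogue LFs in the file instead of the Windows \r\n format). Using Python 3.4 and reading posts like these, I have managed to read the source file and create a file that contains only the broken lines, with all LFs removed, so it is one long line. Now I need to read that line, count the delimiters of the form '<|>', add a newline after the 36th, then continue counting the next 36 and add a newline, and so on. I have tried a few different things, but I am not sure whether I need .tell() and then .seek() to insert the \n. Any suggestions on how to insert a newline after the 36th delimiter?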
my_count = 36  # define the number of delimiters to count
LineNumber = 1  # define line counter
FileName = 'Broken_Registrations.txt'  # variable to define filename
target = open('Target.txt', 'w', encoding='utf-8')  # open a file to write fixed lines
with open(FileName, encoding="utf8") as file:
    for line in file:  # open file read
        cnt = line.count('<|>')  # count delimiters
        if cnt == mycount:  # count until mycount then
            target.write(line).append("\n")  # write line and append new line char
print('DONE!')  # let me know when you finished
target.close()  # close the file opened outside of the with
Answer 0 (score: 0)
OK, I managed it, and it was simple all along. There may be more efficient ways to do this, but this works for me:
#import pdb
#pdb.set_trace()
my_count = 36  # number of delimiters expected on a complete line
LineNumber = 1  # define line counter
FileName = 'Broken_Registrations.txt'  # variable to define filename
target = open('Target.txt', 'w', encoding='utf-8')  # open a file to write fixed lines
with open(FileName, encoding="utf8") as file:
    for line in file:  # read the file line by line
        cnt = line.count('<|>')  # count delimiters
        if cnt == my_count:  # keep only lines with exactly my_count delimiters
            line = line.rstrip()  # remove trailing whitespace
            target.write(line + "\n")  # write line and append new line char
print('DONE!')  # let me know when you finished
target.close()  # close the file opened outside of the with
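
For the case described in the question, where all the records have been joined into one long line, a different approach may be needed, because the loop above only copies lines that already contain exactly 36 delimiters. The following is a minimal sketch, not a tested solution for the real data: the function names insert_newlines and fix_file are made up for illustration, and it assumes the whole file fits in memory and that every record ends right after its 36th '<|>'. No .tell() or .seek() is involved, since the text is rebuilt in memory and written out once.

```python
DELIMITER = '<|>'


def insert_newlines(text, n=36, delimiter=DELIMITER):
    """Return text with a newline inserted after every n-th delimiter."""
    parts = text.split(delimiter)
    out = []
    # Every element except the last was followed by a delimiter in the input.
    for i, part in enumerate(parts[:-1], start=1):
        out.append(part + delimiter)
        if i % n == 0:
            out.append('\n')  # the n-th delimiter closes a record
    out.append(parts[-1])  # whatever follows the final delimiter
    return ''.join(out)


def fix_file(src, dst, n=36):
    """Hypothetical helper: read src as one string and write the split result to dst."""
    with open(src, encoding='utf-8') as source:
        text = source.read()
    # newline='\r\n' makes Python write each '\n' as a Windows line ending.
    with open(dst, 'w', encoding='utf-8', newline='\r\n') as target:
        target.write(insert_newlines(text, n))
```

With the file names from the question, this could be called as fix_file('Broken_Registrations.txt', 'Target.txt'). Opening both files in with blocks also means neither needs an explicit close().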