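I have a project where my program reads a txt file and converts it into a tab-delimited txt file (basically, it reads the input and uses a dictionary: whenever it finds a special character, it inserts a tab, '\t'). The program works fine so far, but it only reads the first line, up to the newline character '\n', and I can't work out what is wrong in my code. Can someone tell me where my code is failing?
Code: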
from Tkinter import *
import Tkinter as tk
import codecs
from string import *
u'\xe1'.encode('utf-8')
root = tk.Tk()
root.title('Tentative 1')
file = open('Data Path', 'r+')
#sentence = file.read()
#sentence = sentence.decode('cp1252', 'strict')
with codecs.open('Data path', encoding='latin1') as f:
    sentence = f.readline()
    if u'\xe1' in sentence:
        print sentence
    else:
        pass
#sentence = sentence.replace("u'\xe1'", "-")
def task():
    print '\n', sentence
def replace_all(text, dic):
    for i, j in dic.iteritems():
        text = text.replace(i, j)
    return text
reps = {'^^':'\t', '(':'\t', ')':'\t', 'ISBN:':'\t', '--':'\t', '"':'\t', '.:':'\t', '|':'\t', 'p.':'\t', ',':' '}
txt = replace_all(sentence, reps)
def txt_conversor():
    txt = replace_all(sentence, reps)
    print '\n', txt
results = tk.Button(root, text='Results', width=25, command=task)
results.pack()
txt = tk.Button(root, text='Convert results', width=25, command=txt_conversor)
txt.pack()
root.mainloop()
I did try changing f.readline() to f.readlines(), but that raised an error:
Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\Pythonwin\pywin\framework\scriptutils.py", line 323, in RunScript
    debugger.run(codeObject, __main__.__dict__, start_stepping=0)
  File "C:\Python27\Lib\site-packages\Pythonwin\pywin\debugger\__init__.py", line 60, in run
    _GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
  File "C:\Python27\Lib\site-packages\Pythonwin\pywin\debugger\debugger.py", line 654, in run
    exec cmd in globals, locals
  File "C:\Users\Joao\Desktop\Script (Console Bug with conversor).py", line 6, in <module>
    import sys
  File "C:\Users\Joao\Desktop\Script (Console Bug with conversor).py", line 233, in replace_all
    text = text.replace(i, j)
AttributeError: 'list' object has no attribute 'replace'
So how do I read multiple lines from a txt file?
Input:
Correia, Teresa Pinto; Henriques, Virgínia; Julião, Rui Pedro^^ (2013)), IX Congresso da Geografia Portuguesa – Geografia: Espaço, Natureza, Sociedade e Ciência--, ISBN: 978-972-99436-6-9, |Lisboa: Associação Portuguesa de Geógrafos. p. 977 e-Book
Dominguez, L.; Aliste, J; Ibáñez Martinez; Natário, M.; Fernandes, Gonçalo Poeta (2013) – Estudio Socioeconomico de la Frontera entre Portugal y España, Edita Riet, Salamanca. (ISBN: 978-84-7797-403-1)
Output:
Correia Teresa Pinto; Henriques Virgínia; Julião Rui Pedro 2013 IX Congresso da Geografia Portuguesa Geografia: Espaço Natureza Sociedade e Ciência 978-972-99436-6-9 Lisboa: Associação Portuguesa de Geógrafos. 977 e-Book
Answer 0 (score: 1)
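f.readline() returns only the next line of the file, including the newline character, so it correctly stops once the first line has been handled.
f.readlines() returns a list of strings, where each string corresponds to one line of the file. Your problem is that you are calling a string method on a list object (sentence).
To fix it, you can either use read(), which returns the entire contents of the file as a single string (probably the most Pythonic solution), or make sure the list is handled properly before it reaches the replace_all function (process the list items one by one, use .join(), and so on).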
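The two fixes described above can be sketched as follows. This is only an illustration under stated assumptions: it is written for Python 3 (the question uses Python 2), io.StringIO stands in for the opened file, the sample text is made up, and reps is a small subset of the question's dictionary.

```python
import io

def replace_all(text, dic):
    # Apply every replacement in turn; mirrors the function in the question.
    for old, new in dic.items():
        text = text.replace(old, new)
    return text

# Small subset of the question's replacement table.
reps = {'(': '\t', ')': '\t', ',': ' '}

# StringIO stands in for the opened file (hypothetical sample data).
sample = u"Correia, Teresa (2013)\nDominguez, L. (2013)\n"

# Option 1: read() gives one string that holds every line.
whole = io.StringIO(sample).read()
converted_whole = replace_all(whole, reps)

# Option 2: readlines() gives a list, so convert each item, then join.
lines = io.StringIO(sample).readlines()
converted_lines = ''.join(replace_all(line, reps) for line in lines)

# Both routes should produce the same converted text.
print(converted_whole == converted_lines)
```

Either route avoids calling .replace() on the list itself, which is what triggered the AttributeError in the traceback.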
You can find a good explanation of the different file methods here: http://interactivepython.org/runestone/static/thinkcspy/Files/files.html#filemethods2a
Answer 1 (score: 0)
Use f.read() to get all of the text as a single string (with every line inside it).
sentence = f.read()
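For context, here is a minimal sketch of that call on a Latin-1 encoded file. It assumes Python 3 and uses io.open in place of the question's codecs.open; the temporary file and its contents are hypothetical, created only so the snippet runs on its own.

```python
import io
import os
import tempfile

# Create a throwaway Latin-1 file so the sketch runs anywhere (made-up data).
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with io.open(path, 'w', encoding='latin-1') as f:
    f.write(u"Juli\xe3o, Rui Pedro^^ (2013)\nIb\xe1\xf1ez Martinez (2013)\n")

# read() returns the entire file as a single unicode string.
with io.open(path, encoding='latin-1') as f:
    sentence = f.read()

os.remove(path)

# Every line is present, so the string contains both newlines.
print(sentence.count(u'\n'))
print(u'\xe1' in sentence)
```

Because sentence is still a plain string, it can be passed to replace_all unchanged.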
Answer 2 (score: 0)
A common way to read a file line by line:
with open(filename, "r") as file:
    for i in file.readlines():
        print(i)
# end with (closes the file)
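A related sketch, assuming Python 3: a file object can also be iterated directly, which yields one line at a time without building the whole list first. The helper name convert_lines, the sample text, and the reduced reps table are hypothetical, and io.StringIO stands in for a real file.

```python
import io

def convert_lines(stream, reps):
    # Hypothetical helper: yield each line with the replacements applied.
    for line in stream:               # a file object yields one line at a time
        line = line.rstrip('\n')      # drop the trailing newline
        for old, new in reps.items():
            line = line.replace(old, new)
        yield line

# Reduced version of the question's replacement table.
reps = {'^^': '\t', '(': '\t', ')': '\t', ',': ' '}

# StringIO stands in for the opened file (made-up sample data).
stream = io.StringIO(u"Pedro^^ (2013)\nSalamanca, (ISBN)\n")
result = list(convert_lines(stream, reps))
print(result)
```

With a real file, the same loop would sit inside the with block so the file is closed once the last line has been processed.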