Question

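So I have a project in which my program reads a txt file and converts it into a tab-delimited txt file (basically it reads the input and, using a dictionary, inserts a tab '\t' whenever it finds a special character). The program works fine (so far), but it only reads the first line, up to the newline '\n', and I just can't figure out what is wrong in my code. Can someone tell me where my code fails?

Code: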

from Tkinter import *
import Tkinter as tk
import codecs
from string import *


u'\xe1'.encode('utf-8')

root = tk.Tk()
root.title('Tentative 1')

file = open('Data Path', 'r+')

#sentence = file.read()
#sentence = sentence.decode('cp1252', 'strict')


with codecs.open('Data path', encoding='latin1') as f:
    sentence = f.readline()


if u'\xe1' in sentence:
 print sentence

else:
 pass
#sentence = sentence.replace("u'\xe1'", "-")

def task():
 print '\n', sentence

def replace_all(text, dic):
 for i, j in dic.iteritems():
    text = text.replace(i, j)
 return text
reps = {'^^':'\t', '(':'\t', ')':'\t', 'ISBN:':'\t', '--':'\t', '"':'\t', '.:':'\t', '|':'\t', 'p.':'\t', ',':' '}
txt = replace_all(sentence, reps)


def txt_conversor():
 txt = replace_all(sentence, reps)
 print '\n', txt

results = tk.Button(root, text='Results', width=25, command=task)
results.pack()
txt = tk.Button(root, text='Convert results', width=25, command=txt_conversor)
txt.pack()

root.mainloop()

I did try changing f.readline() to f.readlines(), but it gives an error:

Traceback (most recent call last):
File "C:\Python27\Lib\site-packages\Pythonwin\pywin\framework\scriptutils.py", line 323, in RunScript
debugger.run(codeObject, __main__.__dict__, start_stepping=0)
File "C:\Python27\Lib\site-packages\Pythonwin\pywin\debugger\__init__.py", line 60, in run
_GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
File "C:\Python27\Lib\site-packages\Pythonwin\pywin\debugger\debugger.py", line 654, in run
exec cmd in globals, locals
File "C:\Users\Joao\Desktop\Script (Console Bug with conversor).py", line 6, in <module>
import sys
File "C:\Users\Joao\Desktop\Script (Console Bug with conversor).py", line 233, in replace_all
text = text.replace(i, j)
AttributeError: 'list' object has no attribute 'replace'

So how do I read multiple lines from the txt file?

Input:
Correia, Teresa Pinto; Henriques, Virgínia; Julião, Rui Pedro^^ (2013)), IX Congresso da      Geografia Portuguesa – Geografia: Espaço, Natureza, Sociedade e Ciência--, ISBN: 978-972-99436-6-9, |Lisboa: Associação Portuguesa de Geógrafos. p. 977 e-Book 


Dominguez, L.; Aliste, J; Ibáñez Martinez; Natário, M.; Fernandes, Gonçalo Poeta (2013) – Estudio Socioeconomico de la Frontera entre Portugal y España, Edita Riet, Salamanca. (ISBN: 978-84-7797-403-1)

Output:

 Correia  Teresa Pinto; Henriques  Virgínia; Julião  Rui Pedro      2013          IX Congresso da Geografia Portuguesa  Geografia: Espaço  Natureza  Sociedade e Ciência        978-972-99436-6-9      Lisboa: Associação Portuguesa de Geógrafos.      977 e-Book

Answer 1

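f.readline() returns only the next line of the file, including the newline character, so it correctly stops after the first line.

f.readlines() returns a list of strings, where each string corresponds to one line of the file. Your problem is that you are calling a string method (replace) on a list object (sentence), which is exactly what the AttributeError in your traceback says.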
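To make the difference concrete, here is a minimal sketch of what each method returns. It is written for Python 3 (the question uses Python 2.7), and the sample file and its contents are made up purely for this demonstration.

```python
import io
import os
import tempfile

# Hypothetical two-line sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with io.open(path, "w", encoding="latin1") as f:
    f.write(u"first^^line\nsecond--line\n")

with io.open(path, encoding="latin1") as f:
    one_line = f.readline()    # a single string: the first line only

with io.open(path, encoding="latin1") as f:
    all_lines = f.readlines()  # a list of strings, one per line

with io.open(path, encoding="latin1") as f:
    whole_text = f.read()      # a single string holding the entire file

print(type(one_line).__name__, repr(one_line))
print(type(all_lines).__name__, repr(all_lines))
print(type(whole_text).__name__, repr(whole_text))

# Strings have .replace(); lists do not, hence the AttributeError.
print(hasattr(whole_text, "replace"), hasattr(all_lines, "replace"))
```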

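To fix this, you can either use read(), which returns the entire contents of the file as a single string (probably the most Pythonic solution), or adapt the code so that replace_all is applied to the list correctly (process the list items one by one, use .join(), and so on).

Here you can find a good explanation of the different file methods: http://interactivepython.org/runestone/static/thinkcspy/Files/files.html#filemethods2a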
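Both fixes could look roughly like the sketch below. It assumes Python 3 (so dic.items() instead of dic.iteritems()), uses a shortened version of the reps dictionary from the question, and works on a made-up sample file; treat it as an illustration rather than a drop-in replacement for the original script.

```python
import io
import os
import tempfile


def replace_all(text, dic):
    """Apply every replacement in dic to text and return the result."""
    for old, new in dic.items():
        text = text.replace(old, new)
    return text


# Shortened version of the question's dictionary, for illustration only.
reps = {"^^": "\t", "--": "\t", ",": " "}

# Hypothetical two-line sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with io.open(path, "w", encoding="latin1") as f:
    f.write(u"a,b^^c\nd--e,f\n")

# Option 1: read the whole file as one string and convert it in one call.
with io.open(path, encoding="latin1") as f:
    converted = replace_all(f.read(), reps)

# Option 2: keep readlines(), convert each line, then join the pieces.
with io.open(path, encoding="latin1") as f:
    converted_lines = [replace_all(line, reps) for line in f.readlines()]
converted_joined = "".join(converted_lines)

print(repr(converted))
print(converted == converted_joined)
```

Option 1 is the smaller change to the original code: only the f.readline() call has to become f.read(). Option 2 is handy if each line needs to be handled separately later on.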

Answer 2

Use f.read() to get all of the text as a single string (every line is contained in it).

sentence = f.read()

Answer 3

The best way to read a file:

with open(filename, "r") as file:
    for i in file.readlines():
        print(i)
# end with (closes the file)
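As a variation on the loop above, a file object can also be iterated directly, which avoids building the full list that readlines() creates. This is a minimal Python 3 sketch; the sample file is made up for the demonstration, and the latin1 encoding is carried over from the question as an assumption about the real data.

```python
import io
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with io.open(path, "w", encoding="latin1") as f:
    f.write(u"line one\nline two\nline three\n")

lines_seen = []
with io.open(path, encoding="latin1") as f:
    for line in f:                       # yields one line at a time
        stripped = line.rstrip("\n")     # drop the trailing newline
        lines_seen.append(stripped)
        print(stripped)
# Leaving the with block closes the file.
```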
