Question

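I am reading a text file. I know that its line 38 is "Uncalibrated Peaks:", and I know it is stored in the 38th element of my list. I have checked both, and there is no indexing problem.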

I am reading the text file with the following code:

import os

fd = open('Report.txt')
contents = fd.readlines()
fd.close()

for ind, line in enumerate(contents):
    line = line.split(" ")
    contents[ind] = line

But when I check the length of the first word on line 38 with

print len(contents[38][0])

The result is 25. I know this command refers to the correct element of the list, so there is no indexing problem. By comparison,

print len('Uncalibrated')

12

In theory the two should be the same. It seems that each character takes up 2 positions in the string, which looks like a Unicode encoding problem.

Answer 1

In general, if the characters in a string seem "too wide", you probably have a Unicode file. Try converting it with the unicode() function.

Looking at the code above, though, it looks more like a simple indexing error.
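
The Unicode theory is easy to test. The sketch below is written for Python 3 (the thread itself uses Python 2) and assumes, purely for illustration, that the report was saved as UTF-16 little-endian with a byte-order mark. The two-line sample text and the temporary Report.txt are stand-ins, not the asker's real file. Under that assumption, splitting the raw bytes of the second line gives a first "word" of 25 bytes: 12 characters at 2 bytes each, plus one zero byte left over from the previous line's newline. Decoding while reading gives the expected 12.

```python
import codecs
import os
import tempfile

# Stand-in for the real report (assumption): two lines of UTF-16LE text with a BOM.
sample_text = "Header line\nUncalibrated Peaks:\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Report.txt")
    with open(path, "wb") as fd:
        fd.write(codecs.BOM_UTF16_LE + sample_text.encode("utf-16-le"))

    # Reading raw bytes mimics what Python 2's open() does by default.
    with open(path, "rb") as fd:
        raw_lines = fd.readlines()
    raw_first_word = raw_lines[1].split(b" ")[0]

    # Letting Python decode the file yields real characters instead of byte pairs.
    with open(path, encoding="utf-16") as fd:
        lines = fd.read().splitlines()
    first_word = lines[1].split(" ")[0]

print(len(raw_first_word))  # length in bytes, including the stray zero byte
print(len(first_word))      # length in characters
```

In Python 2 the closest equivalent would be io.open with an encoding argument, or calling unicode() on the raw data as suggested above. Which encoding the real file uses is something only the asker can confirm.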

Answer 2

Have you tried contents[37][0]? Line 38 should be at index 37, since indexing starts at 0.
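
A minimal sketch of the off-by-one, in Python 3 and using a made-up list of 40 numbered lines rather than the asker's file:

```python
# Made-up contents: 40 lines, each labelled with its 1-based line number.
contents = ["line %d\n" % number for number in range(1, 41)]

line_38 = contents[37]  # the 38th line lives at index 37
line_39 = contents[38]  # index 38 is actually the 39th line

# enumerate(..., start=1) gives 1-based line numbers without manual arithmetic.
numbered = dict(enumerate(contents, start=1))

print(line_38.strip())
print(line_39.strip())
print(numbered[38].strip())
```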

Answer 3

Try

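for ind, line in enumerate(contents):
    if ind == 38:
        print line  # check that this is the line you expect
    line = line.split()  # split on any whitespace
    contents[ind] = line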

to verify that it is the line you want, and then split it. As the posters above said, you may also have misread the line.
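
When printing the line to check it, it may help to print its repr() instead, since a plain print can hide NUL bytes and other invisible characters. A minimal Python 3 sketch, assuming for illustration that the line holds UTF-16 little-endian bytes; raw_line is a made-up sample, not data from the asker's file:

```python
# Made-up sample (assumption): the bytes of one line from a UTF-16LE file.
raw_line = "Uncalibrated Peaks:\n".encode("utf-16-le")

# repr() spells out every hidden byte as an escape such as \x00,
# so padding that a terminal would swallow becomes visible.
shown = repr(raw_line)

print(shown)
print(len(raw_line))  # two bytes per character
```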

Answer 4

# Variant from the question: split each line in an explicit loop
fd = open('foo.html')
contents = fd.readlines()
fd.close()

for ind, line in enumerate(contents):
    line = line.split(" ")
    contents[ind] = line

print contents,'\n\n------------------'


# Same split via map(), which returns a list in Python 2
fd = open('foo.html')
li = fd.readlines()
fd.close()

a = map(lambda x: x.split(" "),li)
print a,'\n',a==contents,'\n\n------------------'


# Same split via a list comprehension over the file object
fd = open('foo.html')
b = [line.split(" ") for line in fd]
fd.close()

print b,'\n',b==contents

Strange string behavior after reading from a text file in Python
