Question

I believe the error is in the read function. It fails to read past the special character shown in the image; see the repr output.

I am using string.find() in Python as follows:

indexOfClosedDoc = temp.find("</DOC>",indexOfOpenDoc)

However, when the string contains the following text:

SUB
</DOC>

where SUB is a special character, temp.find cannot find the tag. Any suggestions on how to fix this?

Example:

[image: screenshot of the example text]

Code that makes it fail:

handle = open("error.txt", 'r')
temp = handle.read()
index = temp.find("</DOC>", 0)
if index == -1:
    print("Error")
    exit(1)

Put the text from the image into a text file and run the code.

Here is the repr of the temp variable for the text in the example. The text in error.txt is everything on line 29722 in the image:

' </P>\n\n'

Note: the read() function never reads past SUB, which is why find fails.

Answer 1

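The answer is to open the file in 'rb' mode. On Windows, opening a file with just 'r' makes it use the old DOS behavior of stopping at 0x1A (the DOS EOF character). See also Line reading chokes on 0x1A.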
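A minimal sketch of the binary-mode approach, written in Python 3 syntax rather than the Python 2 used in the question. The temporary file and its contents are assumptions made up for illustration; they only mimic a SUB byte (0x1A) sitting before the closing tag. In binary mode read() returns bytes, so the search pattern has to be bytes as well.

```python
import os
import tempfile

# Hypothetical sample file: a SUB byte (0x1A) followed by the closing tag.
path = os.path.join(tempfile.mkdtemp(), "error.txt")
with open(path, "wb") as f:
    f.write(b" </P>\n\n\x1a\n</DOC>\n")

# Binary mode returns the raw bytes, including 0x1A and everything after it.
with open(path, "rb") as handle:
    temp = handle.read()

# The pattern must be bytes because temp is a bytes object here.
index = temp.find(b"</DOC>")
print(index)
```

If the text is needed as a string afterwards, the bytes can be decoded explicitly once the encoding of the file is known.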

Answer 2

Note: if the file uses a multibyte encoding, .find() might not work even if there is no 0x1A in it. For example:

import codecs

with codecs.open('file.utf16', 'w', encoding='utf-16') as file:
    file.write(u"abcd")  # write a string using utf-16 encoding

#XXX incorrect code don't use it
with open('file.utf16', 'r') as f:
    temp = f.read()
    i = temp.find('bc')
    print i #XXX -> -1 not found

with open('file.utf16', 'rb') as f:
    temp = f.read()
    i = temp.find('bc')
    print i #XXX -> -1 not found

# works
with codecs.open('file.utf16', encoding='utf-16') as f:
    temp = f.read()
    i = temp.find('bc')
    print i # -> 1 found
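For readers on Python 3, the same point can be sketched with the built-in open(), which accepts an encoding argument directly. This is only an illustration under assumptions: the temporary file path is made up, and the file is written as UTF-16 just to reproduce the situation described in this answer.

```python
import os
import tempfile

# Hypothetical file, written with a multibyte encoding (UTF-16).
path = os.path.join(tempfile.mkdtemp(), "file.utf16")
with open(path, "w", encoding="utf-16") as f:
    f.write("abcd")

# Searching the raw bytes fails: in UTF-16 each ASCII letter is
# separated from the next by a zero byte, so b"bc" never appears.
with open(path, "rb") as f:
    raw = f.read()
raw_index = raw.find(b"bc")

# Decoding with the right encoding first makes the search work.
with open(path, encoding="utf-16") as f:
    text = f.read()
text_index = text.find("bc")

print(raw_index, text_index)
```

The design point is that find() compares whatever units it is given, so the data and the pattern need to be in the same form (both decoded text, or both bytes in the same encoding).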

Answer 3

Check your indexOfOpenDoc value; I suspect it is greater than the position shown.
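To illustrate why the start index matters, here is a small sketch with a made-up string (not the asker's data). The second argument of find() is the position where the search begins, so a start value that lies past the tag makes find() return -1 even though the tag is present.

```python
# Hypothetical sample text, not taken from the question's file.
temp = "<DOC>\ntext\n</DOC>\n"

# Searching from the beginning finds the closing tag.
close_index = temp.find("</DOC>")

# Starting the search after the tag's position misses it.
late_start = close_index + 1
missed = temp.find("</DOC>", late_start)

print(close_index, missed)
```

Printing indexOfOpenDoc just before the find() call is a quick way to rule this case in or out.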

string.find() in Python cannot handle special characters
