Question

I am using this code to find a string in Python:

buildSucceeded = "Build succeeded."
datafile = r'C:\PowerBuild\logs\Release\BuildAllPart2.log'

with open(datafile, 'r') as f:
    for line in f:
        if buildSucceeded in line:
            print(line)

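I am fairly sure the string is in the file, but the code returns nothing.

If I just print the file line by line, I get a lot of 'NUL' characters between each "valid" character.

Edit 1: The problem was the Windows encoding. I changed the encoding after reading this post and it worked: Why doesn't Python recognize my utf-8 encoded source file?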

Anyway, the file looks like this:

Line 1.
Line 2.
...
Build succeeded.
    0 Warning(s)
    0 Error(s)
...

I am currently testing with the Sublime editor for Windows - it shows a 'NUL' character between each "real" character, which is very strange.

Using the python command line I get this output:

C:\Dev>python readFile.py
Traceback (most recent call last):
  File "readFile.py", line 7, in <module>
    print(line)
  File "C:\Program Files\Python35\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 1: character maps to <undefined>

Thanks for your help...
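
The symptoms above (a 'NUL' between every character, and '\xfe' at position 1 in the traceback) are what you would expect if the log is UTF-16 with a byte order mark and is being decoded as a single-byte code page. The sketch below reproduces that in memory. It assumes the file is UTF-16 little-endian with a BOM and that the default encoding is cp1252; neither is confirmed by the question.

```python
import codecs

text = "Build succeeded."

# Assumption: the log was written as UTF-16 little-endian with a BOM.
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Assumption: open(datafile, 'r') fell back to a single-byte code page (cp1252 here).
wrong = raw.decode("cp1252")

# Decoding with the matching codec consumes the BOM and restores the text.
right = raw.decode("utf-16")

print(repr(wrong[:6]))   # the BOM bytes, then each letter followed by a NUL
print(text in wrong)     # the search fails on the mis-decoded text
print(text in right)     # the search succeeds on the correctly decoded text
```

Under these assumptions the NULs interleaved in the mis-decoded string are why `buildSucceeded in line` never matches, and the second BOM byte is the '\xfe' character that the cp437 console cannot print.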

Answer 1

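If your file is not that big, you can do a simple find. Otherwise I would inspect the file to confirm the string is really in it, check for any misspellings, and try to narrow down the problem.

f = open(datafile, 'r')
lines = f.read()
answer = lines.find(buildSucceeded)

Also note that answer will be -1 if the string is not found.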
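
A minimal sketch of the same find approach, wrapped in a function that closes the file and takes an explicit encoding. The helper name `find_in_file`, the "utf-16" default and the temporary demo file are assumptions for illustration only; they are not part of the original answer.

```python
import os
import tempfile


def find_in_file(path, needle, encoding="utf-16"):
    """Return the index of needle in the file's text, or -1 if it is absent."""
    with open(path, "r", encoding=encoding) as f:
        return f.read().find(needle)


# Hypothetical demo file standing in for the real build log.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.log")
    with open(demo_path, "w", encoding="utf-16") as f:
        f.write("Line 1.\nBuild succeeded.\n")

    found = find_in_file(demo_path, "Build succeeded.")
    missing = find_in_file(demo_path, "Build failed.")

print(found)    # a non-negative index when the string is present
print(missing)  # -1 when it is not
```

Reading the whole file into memory is only reasonable for small logs, which is the case this answer addresses.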

Answer 2

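As mentioned above, the problem is related to encoding. The website below has a very good explanation of how to convert a file from one encoding to another.

I used the last example (for Python 3, which is my case) and it works as expected: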

buildSucceeded = "Build succeeded."
datafile = 'C:\\PowerBuild\\logs\\Release\\BuildAllPart2.log'

# Open both input and output streams.
src = open(datafile, "r", encoding="utf-16")
dst = open("output.txt", "w", encoding="utf-8")

# Stream chunks of unicode data.
with src, dst:
    while True:
        # Read a chunk of data.
        chunk = src.read(4096)
        if not chunk:
            break
        # Remove vertical tabs.
        chunk = chunk.replace("\u000B", "")
        # Write the chunk of data.
        dst.write(chunk)

# Search the converted UTF-8 copy.
with open('output.txt', 'r', encoding='utf-8') as f:
    for line in f:
        if buildSucceeded in line:
            print(line)

Source: http://blog.etianen.com/blog/2013/10/05/python-unicode-streams/
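
If the converted copy is not needed for anything else, the search can probably be done directly on the original file by passing the encoding to open(). This is a sketch under the assumption that the log is UTF-16; the helper name `matching_lines` and the temporary demo file are hypothetical and only mirror the sample log shown in the question.

```python
import os
import tempfile


def matching_lines(path, needle, encoding="utf-16"):
    """Return every line of the file that contains needle, without its newline."""
    with open(path, "r", encoding=encoding) as f:
        return [line.rstrip("\n") for line in f if needle in line]


# Hypothetical demo file with the same shape as the log in the question.
sample = "Line 1.\nLine 2.\nBuild succeeded.\n    0 Warning(s)\n    0 Error(s)\n"

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "BuildAllPart2.log")
    with open(demo_path, "w", encoding="utf-16") as f:
        f.write(sample)

    hits = matching_lines(demo_path, "Build succeeded.")

print(hits)
```

This avoids the intermediate output.txt, at the cost of not stripping vertical tabs as the streaming version does.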

Searching for a string in a file in Python does not work
