I am using this code to look for a string in a file with Python:
buildSucceeded = "Build succeeded."
datafile = r'C:\PowerBuild\logs\Release\BuildAllPart2.log'
with open(datafile, 'r') as f:
    for line in f:
        if buildSucceeded in line:
            print(line)
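I am quite sure the string is in the file, but the code does not print anything.
If I simply print the file line by line, I get a lot of "NUL" characters between each "valid" character.
Edit 1: The problem was the file encoding on Windows. I changed the encoding after reading this post, and it worked: Why doesn't Python recognize my utf-8 encoded source file?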
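To illustrate why the plain search finds nothing, here is a minimal sketch. It assumes the log is UTF-16 and is being read with a single-byte codec; `latin-1` stands in for whatever code page Windows picks by default, and the variable names are only for illustration.

```python
text = "Build succeeded."

# Encode the text as UTF-16; Python prepends a byte order mark (BOM).
raw = text.encode("utf-16")

# Decoding those bytes with a single-byte codec turns every byte into a
# character, so a NUL ends up between each pair of letters.
misread = raw.decode("latin-1")

print(repr(misread))
print(text in misread)               # the substring test fails on the misread text
print(text in raw.decode("utf-16"))  # and succeeds once the right codec is used
```

This would also be consistent with the '\xfe' in the traceback, since 0xFE is one of the two bytes of a UTF-16 BOM.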
Anyway, the file looks like this:
Line 1.
Line 2.
...
Build succeeded.
0 Warning(s)
0 Error(s)
...
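I am currently testing with the Sublime editor for Windows, and its output shows a "NUL" character between each "real" character, which is very strange.
Running it from the command line with python, I get this output: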
C:\Dev>python readFile.py
Traceback (most recent call last):
  File "readFile.py", line 7, in <module>
    print(line)
  File "C:\Program Files\Python35\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 1: character maps to <undefined>
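Thanks for your help...
Answer 0 (score: 0)
If your file is not that large, you can do a simple find. Otherwise, I would inspect the file to confirm the string is really in it, check for typos, and try to narrow the problem down.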
# Read the whole file into memory and search it in one go.
with open(datafile, 'r') as f:
    lines = f.read()
answer = lines.find(buildSucceeded)
Also note that answer will be -1 if the string is not found.
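As a rough sketch of the same whole-file search with the encoding made explicit: `find_in_file` is a hypothetical helper, and the `utf-16` default is an assumption based on the question's edit, not something this answer established. The demo writes its own small sample file rather than touching the real log.

```python
import os
import tempfile


def find_in_file(path, needle, encoding="utf-16"):
    """Return the character offset of needle in the file, or -1 if absent."""
    with open(path, "r", encoding=encoding) as f:
        return f.read().find(needle)


# Build a small UTF-16 sample log so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.log")
    with open(sample, "w", encoding="utf-16") as f:
        f.write("Line 1.\nLine 2.\nBuild succeeded.\n")
    found_at = find_in_file(sample, "Build succeeded.")
    missing_at = find_in_file(sample, "Build failed.")

# str.find signals "not found" with -1, so test for that explicitly.
if found_at == -1:
    print("string not found")
else:
    print("string found at offset {}".format(found_at))
```

Comparing against -1 matters because an offset of 0 is a valid match but is falsy in a plain `if` test.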
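Answer 1 (score: 0)
As mentioned above, the problem is related to encoding. The site linked below has a very good explanation of how to convert a file from one encoding to another.
I used the last example (the one for Python 3, which is my case) and it works as expected: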
buildSucceeded = "Build succeeded."
datafile = 'C:\\PowerBuild\\logs\\Release\\BuildAllPart2.log'

# Open both input and output streams: read UTF-16, write UTF-8.
with open(datafile, "r", encoding="utf-16") as infile, \
        open("output.txt", "w", encoding="utf-8") as outfile:
    # Stream chunks of unicode data.
    while True:
        # Read a chunk of data.
        chunk = infile.read(4096)
        if not chunk:
            break
        # Remove vertical tabs.
        chunk = chunk.replace("\u000B", "")
        # Write the chunk of data.
        outfile.write(chunk)

# Search the converted copy, reading it back with the encoding it was written in.
with open('output.txt', 'r', encoding='utf-8') as f:
    for line in f:
        if buildSucceeded in line:
            print(line)
Source: http://blog.etianen.com/blog/2013/10/05/python-unicode-streams/
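If the intermediate output.txt is not needed, the log can probably be searched directly. The following is only a sketch under assumptions: `detect_bom_encoding` and `matching_lines` are hypothetical helpers, the encoding is guessed purely from a leading byte order mark, and files without a BOM fall back to an assumed default of UTF-8.

```python
import codecs
import os
import tempfile


def detect_bom_encoding(path, default="utf-8"):
    """Guess a text encoding from a leading BOM (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Check UTF-32 first: its little-endian BOM starts with the UTF-16 LE BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return default


def matching_lines(path, needle):
    """Return the lines containing needle, decoded with the guessed encoding."""
    with open(path, "r", encoding=detect_bom_encoding(path)) as f:
        return [line.rstrip("\n") for line in f if needle in line]


# Self-contained demo on a temporary UTF-16 file instead of the real log.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.log")
    with open(sample, "w", encoding="utf-16") as f:
        f.write("Line 1.\nBuild succeeded.\n    0 Warning(s)\n")
    detected = detect_bom_encoding(sample)
    hits = matching_lines(sample, "Build succeeded.")

print(detected, hits)
```

The trade-off is that BOM sniffing is a heuristic: a UTF-16 file saved without a BOM would be opened with the default and would need the encoding passed explicitly, as in the code above.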