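The following script correctly identifies ASCII and non-ASCII lines, but I want a report per file rather than per line. Because the print call is inside the loop and I have many files, it produces far too much output. How can I modify this code to get a single output per file? It should tell me whether the file contains any non-ASCII text.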
import os
for file in os.listdir('.'):
    if file.endswith('.txt'):
        with open(file) as f:
            content = f.readlines()
        for entry in content:
            try:
                entry.encode('ascii')
            except UnicodeEncodeError:
                print("it was not a ascii-encoded unicode string")
                print(file)
            else:
                print("It may have been an ascii-encoded unicode string")
                print(file)
Answer 0 (score: 1)
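If you want to report whether a file contains any non-ASCII text, keep a flag that records whether a bad line has been found, and wait until the whole file has been checked before printing anything.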
import os
for file in os.listdir('.'):
    if file.endswith('.txt'):
        with open(file) as f:
            content = f.readlines()
        # Assume the file is clean until a non-ASCII line is found.
        good_file = True
        for entry in content:
            try:
                entry.encode('ascii')
            except UnicodeEncodeError:
                good_file = False
                break  # one bad line is enough, so skip the rest
        # Report once per file, after the lines have been checked.
        if good_file:
            print("It may have been an ASCII-encoded unicode string")
        else:
            print("it was not an ASCII-encoded unicode string")
        print(file)
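
The same per-file idea can also be sketched without an explicit flag. This is only an illustration, assuming Python 3; the helper names `is_ascii_file` and `report` are made up for this example and do not come from the question. It reads each file in binary mode, so a file that is not in the platform's default encoding is reported as non-ASCII instead of raising `UnicodeDecodeError` while being read.

```python
import os


def is_ascii_file(path):
    """Return True if every byte in the file at `path` is below 0x80."""
    # Hypothetical helper: binary mode means no decoding step can fail.
    with open(path, "rb") as f:
        # Iterating a bytes line yields ints; all() stops at the first
        # byte outside the ASCII range.
        return all(byte < 0x80 for line in f for byte in line)


def report(directory="."):
    """Return one (filename, is_ascii) pair per .txt file in `directory`."""
    results = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".txt"):
            full_path = os.path.join(directory, name)
            results.append((name, is_ascii_file(full_path)))
    return results
```

Looping over `report('.')` and printing each pair gives exactly one line of output per file. Returning the pairs rather than printing inside the helper keeps the check separate from the reporting, which makes it easier to filter for only the non-ASCII files later.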