import re
## EDIT: I didn't mean to copy filename = "rr.txt"
## Opens a file saved with the "Unicode" file type
buffer = open('r.txt','r').read()
quotes = re.findall(ur'"[^"^\u201c]*["\u201d].*', buffer)
for quote in quotes:
    print ''
    print quote
## prints quotes found
## Problem: the printed output has rectangular blocks between every character
Why does this happen?
How can I get the output back without the rectangular blocks messing everything up?
Answer 0 (score: 4)
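You are opening it wrong. "Unicode" in Windows is actually UTF-16LE, so when the file is read without decoding, every ASCII character is followed by a NUL byte, and the console shows those bytes as rectangular blocks. Decode the file as you read it: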
import codecs
buffer = codecs.open('r.txt', 'r', encoding='utf-16le').read()
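To see the effect of decoding, here is a minimal sketch written for Python 3 (codecs.open takes the same arguments in Python 2). The sample sentence and the temporary file name r_sample.txt are made up for illustration; only the regex is taken from the question.

```python
import codecs
import os
import re
import tempfile

# Hypothetical sample text standing in for the contents of r.txt.
sample = 'He said "hello" and bye.\n'

# Write it as UTF-16LE, the encoding Windows calls "Unicode"
# (no byte order mark is written here, to keep the sketch small).
path = os.path.join(tempfile.mkdtemp(), 'r_sample.txt')
with open(path, 'wb') as f:
    f.write(sample.encode('utf-16le'))

# Raw bytes: every ASCII character is followed by a NUL byte.
with open(path, 'rb') as f:
    raw = f.read()

# Decoding while reading yields the original text again.
with codecs.open(path, 'r', encoding='utf-16le') as f:
    text = f.read()

# The pattern from the question, applied to the decoded text.
quotes = re.findall('"[^"^\u201c]*["\u201d].*', text)

print(raw[:8])
print(quotes)
```

If the real file starts with a byte order mark, encoding='utf-16' consumes it, whereas 'utf-16le' leaves it in the text as a leading '\ufeff' character.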
Answer 1 (score: 2)
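This has nothing to do with Python. Your console window renders Python's output, and that is where it breaks.
Use a font in the console window that supports the necessary Unicode characters.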
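One way to tell a rendering problem from a data problem is to print an ASCII-only escaped form of the string, which any console font can display. This is a rough sketch for Python 3 using the built-in ascii(); both strings are made up for illustration and do not come from the original file.

```python
# Hypothetical strings for illustration.
decoded = '\u201chi\u201d'  # text that was decoded correctly
# UTF-16LE bytes that were read without decoding (simulated via latin-1).
undecoded = 'hi'.encode('utf-16le').decode('latin-1')

# ascii() escapes every non-ASCII and control character.
decoded_escaped = ascii(decoded)
undecoded_escaped = ascii(undecoded)

print(decoded_escaped)
print(undecoded_escaped)
```

If the escaped form shows the code points you expect, such as \u201c and \u201d, the data is fine and only the console font is at fault. If it shows \x00 after every letter, the file was not decoded.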