This code works when I use a list of ASCII letters and an ASCII string, but I can't get it to work with the non-ASCII characters below.
# -*- coding: utf-8 -*-
asa = ["ā","ē","ī","ō","ū","ǖ","Ā","Ē","Ī","Ō","Ū","Ǖ",
"á","é","í","ó","ú","ǘ","Á","É","Í","Ó","Ú","Ǘ",
"ǎ","ě","ǐ","ǒ","ǔ","ǚ","Ǎ","Ě","Ǐ","Ǒ","Ǔ","Ǚ",
"à","è","ì","ò","ù","ǜ","À","È","Ì","Ò","Ù","Ǜ"]
[x.decode('utf-8') for x in asa]
print list(set(asa) & set("ō"))
Answer 0 (score: 2)
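You need to put your character in a list. Strings are iterable, and your character is stored as a 2-byte string (its UTF-8 encoding), so set("ō") iterates over it and Python treats "ō" as the two separate bytes \xc5 and \x8d: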
>>> list("ō")
['\xc5', '\x8d']
>>> print list(set(asa) & set(["ō"]))
['\xc5\x8d']
>>> print list(set(asa) & set(["ō"]))[0]
ō
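The same pitfall can be sketched in Python 3, which is an assumption here since the thread itself uses Python 2. A bytes object stands in for the Python 2 str, and the names sample, encoded, per_byte and as_element are illustrative only.

```python
# -*- coding: utf-8 -*-
# Python 3 sketch (assumption: the thread's own code is Python 2).
# A bytes object plays the role of the Python 2 byte string.

sample = [c.encode("utf-8") for c in ["ā", "ō", "ū"]]  # shortened stand-in for asa
encoded = "ō".encode("utf-8")                          # b'\xc5\x8d', two bytes

# set(encoded) iterates over the value, so it holds the individual bytes.
# (Python 3 yields ints here; Python 2 yields 1-character strings.)
per_byte = set(sample) & set(encoded)

# Wrapping the value in a list keeps it as a single element.
as_element = set(sample) & set([encoded])

print(per_byte)
print(as_element)
```

Either way, the intersection only finds a match when the whole encoded character is one element of the second set.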
Answer 1 (score: 1)
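Once decoded, the elements of your first set have the form "ō".decode('utf-8') (type unicode), which is equivalent to u"ō". The second set contains byte strings such as "ō" (type str), so the elements never compare equal and the intersection is empty.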
Meditate on this:
>>> 'a' == u'a'
True
>>> 'ō' == u'ō'
__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
>>> list('ō')
['\xc5', '\x8d']
>>> list(u'ō')
[u'\u014d']
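One way to avoid the mismatch is to keep both sides as Unicode text. The following is a minimal sketch under that assumption, not necessarily what the asker ended up with; the names tone_vowels, target, common and mismatch are hypothetical. The u"" prefix is accepted by Python 2 and by Python 3.3 and later.

```python
# -*- coding: utf-8 -*-
# Sketch: compare Unicode text with Unicode text on both sides.
# tone_vowels is a shortened, illustrative stand-in for the asker's list.

tone_vowels = [u"ā", u"ē", u"ī", u"ō", u"ū"]
target = u"ō"

# Both operands hold Unicode strings, so the element matches.
common = set(tone_vowels) & set([target])

# Mixing in the UTF-8 encoded form reproduces the empty intersection.
mismatch = set(tone_vowels) & set([target.encode("utf-8")])

print(len(common))    # one shared element
print(len(mismatch))  # no shared elements
```

Decoding input to Unicode as early as possible, and encoding only when writing output, keeps comparisons like this consistent.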