Question

Using Python 2.7.11.

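Dashes in a UTF-8 document I'm reading are being ignored by the if statement meant to detect them. The dash prints to the console as the '-' character and shows up as u'-' when displayed with repr(). Passing the character through ord() gives ordinal 45, which is the same as the ordinary dash (hyphen-minus) character.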

segment = line[:section_widths[row_index]].strip()
line = line[section_widths[row_index]+1:]
if segment:
    print 'seg'
    if segment is u'-' or segment is '-':
        print 'DASH DETECTED'
        continue
    print "ord %d" % ord(segment[0])

Answer 1

Don't use is for equality checks. Use == instead.

>>> 'stringstringstringstringstring' == 'string' * 5
True
>>> 'stringstringstringstringstring' is 'string' * 5
False

is should be used for identity checks, that is, to test whether two names refer to the same object.
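
Applied to the check in the question, this could look roughly like the sketch below. The helper names is_dash and non_dash_segments and the sample input are made up for illustration; they are not from the original code.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def is_dash(segment):
    """Return True when the segment is exactly one hyphen-minus (ordinal 45)."""
    # == compares values, so the result does not depend on whether the
    # segment happens to be the same object as the literal.
    return segment == u'-'


def non_dash_segments(segments):
    """Strip each segment, skip empty ones and lone dashes, keep the rest."""
    kept = []
    for raw in segments:
        segment = raw.strip()
        if not segment:
            continue
        if is_dash(segment):
            continue
        kept.append(segment)
    return kept


# Hypothetical sample data standing in for the slices of each line.
print(non_dash_segments([u' - ', u' abc ', u'   ', u'-', u'x-y']))
```

Only the lone dashes and the blank segment are skipped; a segment such as u'x-y' that merely contains a dash is kept.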

Answer 2

It turns out that in Python 2.7.x, 'is' behaves differently for unicode strings than it does for ASCII strings. The distinction is mostly explained here: [String comparison in Python: is vs. ==]

Every unicode string is its own object, distinct from the object created for the literal it is compared against.

>>> uni = unicode('unicode')
>>> uni == 'unicode'
True
>>> uni is 'unicode'
False
>>> 
>>> asc = str('ascii')
>>> asc == 'ascii'
True
>>> asc is 'ascii'
True

Edit:

As Mark Tolonen pointed out, this behavior is not consistent, even for plain integers:

>>> x=1
>>> x is 1
True
>>> x=10000
>>> x is 10000
False

(Run on Python 2.7.11 |Anaconda 2.4.0 (x86_64)| (default, Dec 6 2015, 18:57:58) [GCC 4.2.1 (Apple Inc. build 5577)] on darwin)
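
The inconsistency comes from the difference between equality and identity. A rough sketch, assuming CPython and using variable names chosen only for illustration: values built at runtime always compare equal with ==, while whether they share one object is an implementation detail.

```python
from __future__ import print_function

# Build two equal strings at runtime so neither is a shared literal.
first = u''.join([u'da', u'sh'])
second = u''.join([u'da', u'sh'])

# Equality compares contents, so this is reliable.
print(first == second)

# Identity compares the objects themselves. Whether both names point at
# the same object depends on interpreter caching and must not be relied on.
print(id(first) == id(second))

# The same applies to integers parsed at runtime.
x = int('10000')
y = int('10000')
print(x == y)
print(id(x) == id(y))
```

CPython caches small integers and interns some strings, which is why is sometimes appears to work. None of that is guaranteed by the language, so == is the right test for value comparisons.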

if statement does not detect Unicode dash
