Inequalities for checking character ordinals in Python

Time: 2019-10-04 00:27:01

Tags: python function inequality

Is there any difference between writing 33 <= cp <= 47 and writing cp >= 33 and cp <= 47?

More specifically, given a function like this:

def _is_punctuation(char):
  """Checks whether `chars` is a punctuation character."""
  cp = ord(char)
  if ((cp >= 33 and cp <= 47) or (cp >= 58 and cp <= 64) or
      (cp >= 91 and cp <= 96) or (cp >= 123 and cp <= 126)):
    return True
  else:
    return False

is it the same as the following?

def is_punctuation(char):
    """Checks whether `chars` is a punctuation character."""
    # Treat all non-letter/number ASCII as punctuation.
    # Characters such as "^", "$", and "`" are not in the Unicode
    # punctuation class but treat them as punctuation anyways, for consistency.
    cp = ord(char)
    if (33 <= cp <= 47) or (58 <= cp <= 64) or (91 <= cp <= 96) or (123 <= cp <= 126):
        return True
    return False

Is there any reason to prefer _is_punctuation() over is_punctuation(), or vice versa?

Would one be computed faster than the other? If so, how can we verify that? With dis.dis?


P.S.: I'm asking because I can't find a reason why the Google AI engineers would prefer the original _is_punctuation implementation at https://github.com/google-research/bert/blob/master/tokenization.py#L386

1 answer:

Answer 0 (score: 1)

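No, they are semantically identical. You can also return the condition directly instead of using an if clause, since it evaluates to a boolean anyway: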

return (33 <= cp <= 47) or (58 <= cp <= 64) or (91 <= cp <= 96) or (123 <= cp <= 126)
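The equivalence can also be checked by brute force rather than taken on faith. The following is a minimal sketch, not the BERT code itself: the names is_punct_explicit and is_punct_chained are hypothetical stand-ins for the two functions in the question, and the loop compares them on every code point that chr() accepts.

```python
import string
import sys


def is_punct_explicit(char):
    """Explicit form: each bound is a separate comparison joined with `and`."""
    cp = ord(char)
    return ((cp >= 33 and cp <= 47) or (cp >= 58 and cp <= 64) or
            (cp >= 91 and cp <= 96) or (cp >= 123 and cp <= 126))


def is_punct_chained(char):
    """Chained form: `a <= cp <= b` means `a <= cp and cp <= b`."""
    cp = ord(char)
    return (33 <= cp <= 47) or (58 <= cp <= 64) or (91 <= cp <= 96) or (123 <= cp <= 126)


# Collect every code point on which the two versions disagree.
mismatches = [cp for cp in range(sys.maxunicode + 1)
              if is_punct_explicit(chr(cp)) != is_punct_chained(chr(cp))]

# Collect the characters the chained version accepts.
accepted = {chr(cp) for cp in range(sys.maxunicode + 1) if is_punct_chained(chr(cp))}

print("mismatches:", len(mismatches))
print("same set as string.punctuation:", accepted == set(string.punctuation))
```

An empty mismatch list means the two spellings agree on every input. The four ranges happen to cover exactly the characters in string.punctuation, which gives a second sanity check.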

They (the Google AI engineers) may not have known about chained comparisons, or they may have wanted it to perform slightly better.
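Whether there is any real performance difference is something to measure rather than guess. Here is a rough sketch using dis and timeit from the standard library. The function names explicit and chained are hypothetical, and only two of the four ranges are kept so the bytecode listing stays short. In CPython the chained form typically adds a few stack-manipulation instructions while the explicit form reloads cp, but the exact bytecode and the timings depend on the interpreter version and the machine.

```python
import dis
import timeit


def explicit(cp):
    return (cp >= 33 and cp <= 47) or (cp >= 58 and cp <= 64)


def chained(cp):
    return (33 <= cp <= 47) or (58 <= cp <= 64)


# Print the bytecode of both versions for a side-by-side look.
for fn in (explicit, chained):
    print(f"--- {fn.__name__} ---")
    dis.dis(fn)

# Time repeated calls with the same argument; numbers vary between runs.
timings = {}
for fn in (explicit, chained):
    timings[fn.__name__] = timeit.timeit(lambda: fn(60), number=200_000)
    print(f"{fn.__name__}: {timings[fn.__name__]:.4f} s")
```

Any gap seen this way is usually tiny next to the cost of the function call and ord(), so readability is a reasonable tiebreaker.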