How does 'translate(self)' work?

Time: 2014-02-13 07:02:25

Tags: python

class Keeper(object):

    def __init__(self, keep):
        self.keep = set(map(ord, keep))  # built-in set; the sets module was never imported

    def __getitem__(self, n):
        if n not in self.keep:
            return None
        return unichr(n)

    def __call__(self, s):
        return unicode(s).translate(self)

makefilter = Keeper

if __name__ == '__main__':
    just_vowels = makefilter('aeiouy')

    print just_vowels(u'four score and seven years ago')   

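It prints "ouoeaeeyeaao".

I know that the 'translate' function is supposed to take a table argument created by string.maketrans().

But why is 'self' passed to the translate function?

How does it end up calling the __getitem__ function?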

1 Answer:

Answer 0: (score: 3)

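Before getting to your code snippet, let me first explain when __getitem__ is called.

Here is what the documentation says about __getitem__:

object.__getitem__(self, key): Called to implement evaluation of self[key].

For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.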
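The rules quoted above can be sketched with a small sequence-like class. This is only an illustration: the class name Squares and its size parameter are made up for this example, and slice support is left out for brevity.

```python
class Squares(object):
    """Hypothetical read-only sequence of the first `size` square numbers."""

    def __init__(self, size):
        self.size = size

    def __getitem__(self, key):
        # Keys of an inappropriate type raise TypeError (slices are not handled here).
        if not isinstance(key, int):
            raise TypeError('index must be an integer, got %r' % (key,))
        # Interpreting negative indexes is the method's own responsibility.
        if key < 0:
            key += self.size
        # Anything still outside the valid range raises IndexError.
        if not 0 <= key < self.size:
            raise IndexError('index out of range')
        return key * key


squares = Squares(4)
first = squares[0]    # evaluated as squares.__getitem__(0)
last = squares[-1]    # the negative index is resolved inside __getitem__
```

Here squares[4] raises IndexError and squares['a'] raises TypeError, matching the conventions in the quoted documentation.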

So, let's look at the following snippet:

class Keeper(object):
    def __init__(self, keep):
        self.keep = set(map(ord, keep))

if __name__ == '__main__':
    just_vowels = Keeper('aeiouy')
    print just_vowels[1]

Output: the error says the object "does not support indexing", because no __getitem__ method is defined.

Traceback (most recent call last):
  File "tran.py", line 15, in <module>
   print just_vowels[1]
TypeError: 'Keeper' object does not support indexing

Now let's change the snippet and add __getitem__ so that the object can be indexed:

class Keeper(object):
    def __init__(self, keep):
        self.keep = set(map(ord, keep))

    def __getitem__(self, n):
        if n in self.keep:
            return unichr(n)
        else:
            return 'Not Found in %s' % self.keep

if __name__ == '__main__':
    just_vowels = Keeper('aeiouy')
    for i in range(97,103):
        print just_vowels[i]

Output:

a
Not Found in set([97, 101, 105, 111, 117, 121])
Not Found in set([97, 101, 105, 111, 117, 121])
Not Found in set([97, 101, 105, 111, 117, 121])
e
Not Found in set([97, 101, 105, 111, 117, 121])

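Finally, let's come back to your snippet. When self is passed to translate as the mapping table, it plays the role of a dictionary: for each character in the string, translate looks up table[ord(character)], and that lookup calls the __getitem__ method. Only the ordinals in the set [97, 101, 105, 111, 117, 121] are kept. If the ordinal is not in the set, __getitem__ simply returns None, which tells translate to delete that character from the unicode string.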
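One way to see this for yourself is to record the keys that translate asks for. The sketch below assumes Python 3, where str.translate and chr stand in for unicode.translate and unichr; the class name LoggingKeeper and its lookups attribute are made up for this illustration.

```python
class LoggingKeeper(object):
    """Hypothetical variant of Keeper that records every key it is asked for."""

    def __init__(self, keep):
        self.keep = set(map(ord, keep))
        self.lookups = []

    def __getitem__(self, n):
        # translate() passes the ordinal of a character as the key.
        self.lookups.append(n)
        if n not in self.keep:
            return None      # None means "delete this character"
        return chr(n)        # chr() is the Python 3 counterpart of unichr()

    def __call__(self, s):
        # The object passes itself as the translation table.
        return str(s).translate(self)


just_vowels = LoggingKeeper('aeiouy')
result = just_vowels('four')   # 'f' and 'r' are dropped, 'o' and 'u' survive
```

After the call, just_vowels.lookups holds the ordinals of 'f', 'o', 'u' and 'r', which shows that translate never inspects the object beyond indexing it with integers.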

Here is a quick check of which built-in Python types define __getitem__ and therefore support indexing:

>>> '__getitem__' in dir(dict)
True
>>> '__getitem__' in dir(list)
True
>>> '__getitem__' in dir(set)
False
>>> '__getitem__' in dir(tuple)
True
>>> '__getitem__' in dir(str)
True
>>>

An example of trying to index a set:

>>> s
set([1, 2])
>>> s[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object does not support indexing
>>>

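Now let me explain the unicode translate part. I expect you already know this, but it is here for those who do not.

Here is what the help for unicode.translate says: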

>>> help(unicode.translate)
Help on method_descriptor:

translate(...)
    S.translate(table) -> unicode
    Return a copy of the string S, where all characters have been mapped
    through the given translation table, which must be a mapping of
    Unicode ordinals to Unicode ordinals, Unicode strings or None.
    Unmapped characters are left untouched. Characters mapped to None
    are deleted.
>>>

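So the table it requires can be a dictionary that maps Unicode ordinals to Unicode ordinals, Unicode strings, or None.

Let's take an example: removing punctuation from a unicode string: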
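The three kinds of values can be shown in one small table. This is a minimal sketch assuming Python 3, where str.translate behaves like unicode.translate; the name table is only for illustration.

```python
table = {
    ord('a'): ord('A'),   # ordinal -> ordinal: replace 'a' with 'A'
    ord('b'): 'BEE',      # ordinal -> string: replace 'b' with several characters
    ord('c'): None,       # ordinal -> None: delete 'c'
}

# 'd' has no entry in the table, so it is left untouched.
result = 'abcd'.translate(table)   # -> 'ABEEd'
```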

>>> uni_string = unicode('String with PUnctu@tion!."##')
>>> uni_string
u'String with PUnctu@tion!."##'
>>>

Let's create a dictionary that maps each punctuation character to None:

>>> punc = '!"#$.'
>>> punc_map = {ord(x):None for x in punc }
>>> punc_map
{33: None, 34: None, 35: None, 36: None, 46: None}
>>>

Let's translate the unicode string with this punc_map to remove the punctuation:

>>> uni_string
u'String with PUnctu@tion!."##'
>>> uni_string.translate(punc_map)
u'String with PUnctu@tion'
>>>
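Since the question mentions string.maketrans(), here is a hedged aside for Python 3 only (the session above is Python 2): str.maketrans called with three arguments builds the same kind of ordinal-to-None dictionary from its third argument. The variable names mirror the example above.

```python
punc = '!"#$.'

# The first two arguments (characters to replace and their replacements) are empty;
# every character in the third argument is mapped to None, i.e. deleted.
punc_map = str.maketrans('', '', punc)

cleaned = 'String with PUnctu@tion!."##'.translate(punc_map)
```

The resulting punc_map is an ordinary dict keyed by ordinals, so it is interchangeable with the hand-built dictionary.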