What is the difference between __iter__ and __getitem__?

Asked: 2013-12-12 18:12:37

Tags: python python-2.7 python-3.x

This happens for me in both Python 2.7.6 and 3.3.3. When I define a class like this

from __future__ import print_function  # only needed on Python 2.7

class foo:
    def __getitem__(self, *args):
        print(*args)

and then try to iterate over an instance (which I thought would call __iter__),

bar = foo()
for i in bar:
    print(i)

it just counts up from zero in args and prints None forever. Is this intentional as far as the language design is concerned?

Sample output

0
None
1
None
2
None
3
None
4
None
5
None
6
None
7
None
8
None
9
None
10
None

3 answers:

Answer 0 (score: 20)

Yes, this is an intended design. It is documented, well tested, and relied upon by sequence types such as str.

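The __getitem__ version is a legacy from before Python had modern iterators. The idea was that any sequence (something that is indexable and has a length) would be automatically iterable using the series s[0], s[1], s[2], ... until IndexError or StopIteration is raised.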

In Python 2.7, for example, strings are iterable because of the __getitem__ method (the str type does not have an __iter__ method).

In contrast, the iterator protocol lets any class be iterable without necessarily being indexable (dicts and sets, for example).
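As a small illustration of that contrast, here is a sketch for Python 3. The class name OnlyGetItem is made up for this example. A set can be looped over but has no __getitem__, while a class with only __getitem__ can still be looped over, even though collections.abc.Iterable (which only looks for __iter__) does not recognize it.

```python
from collections.abc import Iterable


class OnlyGetItem:
    """Hypothetical class: indexable with 0, 1, 2 but defines no __iter__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


# A set is iterable through __iter__ but cannot be indexed.
set_has_getitem = hasattr(set, "__getitem__")

# iter() falls back to __getitem__, so building a list works...
legacy_items = list(OnlyGetItem())

# ...even though the Iterable ABC only checks for __iter__.
legacy_is_iterable_abc = isinstance(OnlyGetItem(), Iterable)

print(set_has_getitem, legacy_items, legacy_is_iterable_abc)
```

So calling iter() on the object is a more reliable check for "can I loop over this?" than the isinstance test.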

Here is how to make an iterable class using the legacy style for sequences:

>>> class A:
        def __getitem__(self, index):
            if index >= 10:
                raise IndexError
            return index * 111

>>> list(A())
[0, 111, 222, 333, 444, 555, 666, 777, 888, 999]

Here is how to make an iterable using the __iter__ approach:

>>> class B:
        def __iter__(self):
            yield 10
            yield 20
            yield 30


>>> list(B())
[10, 20, 30]

For those who are interested in the details, the relevant code is in Objects/iterobject.c:

static PyObject *
iter_iternext(PyObject *iterator)
{
    seqiterobject *it;
    PyObject *seq;
    PyObject *result;

    assert(PySeqIter_Check(iterator));
    it = (seqiterobject *)iterator;
    seq = it->it_seq;
    if (seq == NULL)
        return NULL;

    result = PySequence_GetItem(seq, it->it_index);
    if (result != NULL) {
        it->it_index++;
        return result;
    }
    if (PyErr_ExceptionMatches(PyExc_IndexError) ||
        PyErr_ExceptionMatches(PyExc_StopIteration))
    {
        PyErr_Clear();
        Py_DECREF(seq);
        it->it_seq = NULL;
    }
    return NULL;
}

and in Objects/abstract.c:

int
PySequence_Check(PyObject *s)
{
    if (s == NULL)
        return 0;
    if (PyInstance_Check(s))
        return PyObject_HasAttrString(s, "__getitem__");
    if (PyDict_Check(s))
        return 0;
    return  s->ob_type->tp_as_sequence &&
        s->ob_type->tp_as_sequence->sq_item != NULL;
}
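For readers who do not read C, the logic of iter_iternext shown above can be approximated in Python. This is only a rough sketch for Python 3, not the interpreter's actual code; SeqIter and Squares are names invented for the illustration.

```python
class SeqIter:
    """Rough Python approximation of the C sequence iterator (illustration only)."""

    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.seq is None:
            # Already exhausted: keep signalling the end.
            raise StopIteration
        try:
            result = self.seq[self.index]
        except (IndexError, StopIteration):
            # Either exception marks the end; drop the reference,
            # like `it->it_seq = NULL` in the C code.
            self.seq = None
            raise StopIteration
        self.index += 1
        return result


class Squares:
    """Hypothetical __getitem__-only class used to exercise SeqIter."""

    def __getitem__(self, index):
        if index >= 5:
            raise IndexError(index)
        return index * index


squares_via_sketch = list(SeqIter(Squares()))
squares_via_builtin = list(iter(Squares()))
print(squares_via_sketch, squares_via_builtin)
```

Both lists should come out the same, since the built-in iter() returns this kind of sequence iterator when __iter__ is missing.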

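Answer 1 (score: 3)

__iter__ is the preferred way to iterate over an iterable object. If it is not defined, the interpreter will try to simulate its behavior using __getitem__. Take a look here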
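A minimal sketch of that precedence, written for Python 3 (the class names Both and GetItemOnly are made up for this example): when a class defines both methods, iteration goes through __iter__, and __getitem__ is only used as a fallback.

```python
class Both:
    """Hypothetical class defining both protocols."""

    def __iter__(self):
        return iter(["from __iter__"])

    def __getitem__(self, index):
        if index >= 1:
            raise IndexError(index)
        return "from __getitem__"


class GetItemOnly:
    """Hypothetical class defining only the legacy protocol."""

    def __getitem__(self, index):
        if index >= 1:
            raise IndexError(index)
        return "from __getitem__"


both_items = list(Both())            # __iter__ wins
fallback_items = list(GetItemOnly())  # falls back to __getitem__
indexed_item = Both()[0]              # indexing still uses __getitem__

print(both_items, fallback_items, indexed_item)
```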

Answer 2 (score: 1)

To get the result you are expecting, you need a data element with a finite length and to return its items in sequence:

class foo:
    def __init__(self):
        self.data=[10,11,12]

    def __getitem__(self, arg):
        print('__getitem__ called with arg {}'.format(arg))
        return self.data[arg]

bar = foo()
for i in bar:
    print('__getitem__ returned {}'.format(i)) 

Prints:

__getitem__ called with arg 0
__getitem__ returned 10
__getitem__ called with arg 1
__getitem__ returned 11
__getitem__ called with arg 2
__getitem__ returned 12
__getitem__ called with arg 3

Or you can signal the end of the 'sequence' by raising IndexError (although StopIteration works as well...):

class foo:
    def __getitem__(self, arg):
        print('__getitem__ called with arg {}'.format(arg))
        if arg>3:
            raise IndexError
        else:    
            return arg

bar = foo()
for i in bar:
    print('__getitem__ returned {}'.format(i))   

Prints:

__getitem__ called with arg 0
__getitem__ returned 0
__getitem__ called with arg 1
__getitem__ returned 1
__getitem__ called with arg 2
__getitem__ returned 2
__getitem__ called with arg 3
__getitem__ returned 3
__getitem__ called with arg 4

The for loop expects either IndexError or StopIteration to signal the end of the sequence.
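To check that both exceptions really end the loop, here is a short sketch for Python 3 with two made-up classes that differ only in the exception they raise. IndexError is the conventional choice for __getitem__; StopIteration is shown only because the sequence iterator happens to accept it.

```python
class EndsWithIndexError:
    """Hypothetical class: signals the end with IndexError."""

    def __getitem__(self, index):
        if index > 3:
            raise IndexError(index)
        return index


class EndsWithStopIteration:
    """Hypothetical class: signals the end with StopIteration."""

    def __getitem__(self, index):
        if index > 3:
            raise StopIteration
        return index


index_error_items = [i for i in EndsWithIndexError()]
stop_iteration_items = [i for i in EndsWithStopIteration()]

print(index_error_items, stop_iteration_items)
```

Any other exception raised from __getitem__ is not swallowed and propagates out of the loop.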