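One difference between lists and sets is that a list can be changed during iteration: we can append to it inside a loop, and so on. If we try to add to a set during a for loop, however, a RuntimeError is raised. But how does Python detect that set.add() is being called inside a loop and then raise the error? If I want to recreate this behaviour in my own pseudo-list class, so that its append function raises a RuntimeError during iteration, do I just override __iter__ to prevent any appends?
For example: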
a_set = {1,2,3,4}
a_list = [1,2,3,4]
for i in a_list:
    a_list.append(5)
results in an infinite loop, whereas
for j in a_set:
    a_set.add(5)
results in a RuntimeError.
Both of them have an __iter__ function, so in my pseudo-list class, how should I override __iter__ so that it raises a RuntimeError the way set does?
Answer 0 (score: 2)
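When a for loop is entered, Python first calls iter on the iterable to get or create an iterator. The loop then requests the next item from the iterator until it sees a StopIteration exception (unless control leaves the loop earlier through a break or return statement, or some other exception). A for loop such as: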
for element in iterable:
    ...
can be rewritten like this:
it = iter(iterable)
while True:
    try:
        element = next(it)
    except StopIteration:
        break
    ...
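To make the equivalence concrete, here is a small runnable sketch of the same rewrite. The helper name manual_for is made up for this illustration; it simply collects the items instead of running a loop body.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next() explicitly."""
    collected = []
    it = iter(iterable)          # done once, when the loop is entered
    while True:
        try:
            element = next(it)   # done at the start of every iteration
        except StopIteration:
            break                # normal end of the loop
        collected.append(element)  # stands in for the loop body
    return collected


# Yields the same items as a plain for loop over the same iterable.
print(manual_for([1, 2, 3]))
```

The point to take away is that every item comes out of next(it), so that call is where an iterator has the chance to complain.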
Now, by using a list instance as the iterable, you will be iterating a different iterator type than when you use a set instance as the iterable:
>>> iter([0])
<list_iterator at 0xcafef00d>
>>> iter({0})
<set_iterator at 0xdeadbeef>
The set_iterator type and the list_iterator type implement __next__ differently. In CPython, the setiter_iternext function guards against the set changing size. The listiter_next function has no such guard.
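The difference is easy to observe from Python without reading the C source. The sketch below uses a made-up helper, grow_while_iterating, which adds to the container from inside the loop but stops adding once it reaches a given length, so the list case terminates.

```python
def grow_while_iterating(container, add, limit):
    """Iterate over container, adding new items until it holds `limit` items."""
    seen = []
    for item in container:
        seen.append(item)
        if len(container) < limit:
            add(item * 10)   # mutate the container mid-iteration
    return seen


numbers_list = [1, 2, 3]
# The list iterator just keeps walking by index, so it also visits the new items.
list_result = grow_while_iterating(numbers_list, numbers_list.append, limit=5)
print(list_result)

numbers_set = {1, 2, 3}
try:
    grow_while_iterating(numbers_set, numbers_set.add, limit=5)
    set_error = None
except RuntimeError as exc:
    # The set iterator notices the size change on the following next() call.
    set_error = str(exc)
print(set_error)
```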
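I hope you can now see how to build a similar safeguard directly into a Python iterator. When defining the __next__ method, you can check whether the size has changed and raise: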
class MyIterator:
    def __init__(self, obj):
        self.obj = obj  # note: you may prefer to use a weakref here
        self.it = iter(obj)
        self.initial_size = len(obj)
    def __iter__(self):
        return self
    def __next__(self):
        if len(self.obj) != self.initial_size:
            raise RuntimeError('changed size...doh!')
        return next(self.it)

class GrumpyList:
    def __init__(self, data):
        self.data = data
    def __iter__(self):
        return MyIterator(self.data)
Demo:
>>> g = GrumpyList([0, 1, 2, 3])
>>> for i in g:
...     print(i)
...     if i == 2:
...         g.data.append(99)
...
0
1
2
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
...
RuntimeError: changed size...doh!
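The same guard can also be written with a generator, which saves the separate iterator class. This is only a sketch of an alternative and not part of the answer above; the class name GuardedList and the error message are invented for the example.

```python
class GuardedList:
    """List wrapper whose iteration fails if the length changes mid-loop."""

    def __init__(self, data):
        self.data = list(data)

    def append(self, item):
        self.data.append(item)

    def __iter__(self):
        # Calling a generator function returns an iterator, so this one
        # method covers both __iter__ and __next__.
        initial_size = len(self.data)
        for index in range(initial_size):
            if len(self.data) != initial_size:
                raise RuntimeError("GuardedList changed size during iteration")
            yield self.data[index]
        # Also catch a change made while the last item was being processed.
        if len(self.data) != initial_size:
            raise RuntimeError("GuardedList changed size during iteration")


g = GuardedList([0, 1, 2, 3])
printed = []
try:
    for i in g:
        printed.append(i)
        if i == 2:
            g.append(99)
    message = None
except RuntimeError as exc:
    message = str(exc)
print(printed, message)
```

As with the class-based version, the error surfaces when the loop asks for the next item, not at the moment append is called.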
Answer 1 (score: 2)
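All __iter__ does is return an iterator. Note that for a set it is actually the __next__ method that raises the error, not necessarily the for loop itself. A for loop implicitly calls __iter__ on the iterable, then calls __next__ on the resulting iterator and assigns the result to the loop variable, and it keeps doing so at the start of each iteration until StopIteration is raised (this is the iterator protocol). As the message suggests, the error is raised when the set changes size, which you can see by calling next yourself: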
In [2]: s = {1,2,3}
In [3]: it = iter(s)
In [4]: next(it)
Out[4]: 1
In [5]: s.add(1)
In [6]: next(it)
Out[6]: 2
In [7]: s.add(99)
In [8]: next(it)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-58-2cdb14c0d4d6> in <module>()
----> 1 next(it)
RuntimeError: Set changed size during iteration
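The same session can be written as a plain script. This is a minimal sketch that stores the outcome in variables (the names first, second and outcome are chosen for the example) instead of relying on the interactive prompt.

```python
s = {1, 2, 3}
it = iter(s)
first = next(it)

s.add(first)        # element already present, so the size stays at 3
second = next(it)   # still fine: nothing changed from the iterator's view

s.add(99)           # a genuinely new element, so the size becomes 4
try:
    next(it)
    outcome = "no error"
except RuntimeError as exc:
    outcome = str(exc)

print(outcome)
```

Adding an element that is already in the set leaves the size alone, which is why the second next call succeeds.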
We can do the same thing in our own classes:
In [11]: class MyListIterator:
    ...:     def __init__(self, origin):
    ...:         self.origin = origin
    ...:         self.original_size = len(origin)
    ...:         self.i = 0
    ...:     def __iter__(self):
    ...:         return self
    ...:     def __next__(self):
    ...:         if len(self.origin) != self.original_size:
    ...:             raise RuntimeError("MyList changed size during iteration!")
    ...:         elif self.i == self.original_size:
    ...:             raise StopIteration
    ...:         x = self.origin.data[self.i]
    ...:         self.i += 1
    ...:         return x
    ...:
    ...: class MyList:
    ...:     def __init__(self):
    ...:         self.data = [1,2,3]
    ...:     def __iter__(self):
    ...:         return MyListIterator(self)
    ...:     def __len__(self):
    ...:         return len(self.data)
    ...:     def append(self, item):
    ...:         self.data.append(item)
    ...:
Now:
In [12]: mylist = MyList()
In [13]: for x in mylist:
    ...:     print(x)
    ...:
1
2
3
In [14]: for x in mylist:
    ...:     mylist.append(3)
    ...:     print(x)
    ...:
1
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-14-3bd26e0c08b9> in <module>()
----> 1 for x in mylist:
2 mylist.append(3)
3 print(x)
4
<ipython-input-14-f69ab7d03470> in __next__(self)
8 def __next__(self):
9 if len(self.origin) != self.original_size:
---> 10 raise RuntimeError("MyList changed size during iteration!")
11 elif self.i == self.original_size:
12 raise StopIteration
RuntimeError: MyList changed size during iteration!
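Note that 1 is printed before the error is raised. That is because the for loop does not implicitly call __next__ again until the second iteration (or the start of the second iteration, however you want to think about it), and that is when the error is raised.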
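One limitation of comparing lengths is that a removal followed by an append leaves the length unchanged, so the check passes. A possible variant, which is not taken from any of the answers here, is to count modifications instead. The class name VersionedList and the _version attribute below are hypothetical names for this sketch.

```python
class VersionedList:
    """List wrapper that counts modifications instead of comparing lengths."""

    def __init__(self, items=()):
        self._data = list(items)
        self._version = 0          # bumped on every mutation

    def __len__(self):
        return len(self._data)

    def append(self, item):
        self._data.append(item)
        self._version += 1

    def pop(self):
        item = self._data.pop()
        self._version += 1
        return item

    def __iter__(self):
        expected = self._version   # snapshot taken when iteration starts
        index = 0
        while True:
            if self._version != expected:
                raise RuntimeError("VersionedList modified during iteration")
            if index >= len(self._data):
                return
            yield self._data[index]
            index += 1


v = VersionedList([1, 2, 3])
seen = []
try:
    for x in v:
        seen.append(x)
        v.pop()        # length 3 -> 2
        v.append(x)    # length 2 -> 3, the same size as at the start
    caught = False
except RuntimeError:
    caught = True
print(seen, caught)
```

The trade-off is that every mutating method has to remember to bump the counter, whereas the length check needs no cooperation from append.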
Answer 2 (score: 0)
>>> x = {1,2,3,4,656,6,34,23,24,4,23,52}
>>> x
{1, 2, 3, 4, 34, 6, 656, 52, 23, 24}
>>> for i in x:
...     x.add('what')
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: Set changed size during iteration
>>>
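As you can see from the first 3 lines, sets do not store any index, which means they are unordered. Because a set is unordered, there is no way to determine what the next item is if a new item is added during iteration; hence the RuntimeError.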
>>> class C:
...     someiter = 1,2,3,4,5
...     def __iter__(s):
...         return iter(s.someiter)
...
>>> for i in C():
...     print(i)
...
1
2
3
4
5
>>>
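You mentioned it yourself: when you put an object in a for loop, Python calls theobject.__iter__. That means you are not simply adding something to your set; iter is called first, and that is how Python knows the difference.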