Question

I noticed that when I use the add function of Python's set structure, the element seems to be added at a position I cannot figure out.

>>> a=set([(0, 2)])
>>> a.add((0,4))
>>> a
set([(0, 2), (0, 4)])
>>> a.add((1,0))
>>> a
set([(1, 0), (0, 2), (0, 4)])
>>> a.add((2,5))
>>> a
set([(2, 5), (1, 0), (0, 2), (0, 4)])
>>> a.add((3,0))
>>> a
set([(3, 0), (2, 5), (1, 0), (0, 2), (0, 4)])
>>> a.add((1,6))
>>> a
set([(3, 0), (0, 2), (1, 6), (0, 4), (2, 5), (1, 0)])

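As you can see, the element is sometimes added at the beginning and at other times at the end or in the middle. In the last example, the existing elements were also reordered.

Any idea how the insertion happens?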

Answer 1

Sets are unordered. The concept of "where" an element is in a set is undefined.

Answer 2

Sets in Python are unordered. The order is arbitrary.
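A minimal sketch of what "unordered" means in practice, written in Python 3 syntax with illustrative variable names: two sets built in different insertion orders compare equal, and an explicit sorted() call is the way to get a reproducible sequence.

```python
# Two sets built from the same elements in different insertion orders.
a = set()
for item in [(0, 2), (0, 4), (1, 0)]:
    a.add(item)

b = set()
for item in [(1, 0), (0, 4), (0, 2)]:
    b.add(item)

# Equality depends only on membership, never on insertion or display order.
same_members = (a == b)

# When a stable order is needed, ask for it explicitly.
ordered = sorted(a)
print(same_members, ordered)
```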

Answer 3

Sets use the same hash function as dicts to add elements. In effect, they are just dicts without the value part.
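As a rough illustration of that shared mechanism, the sketch below defines a hypothetical Point class (not from the original answer) whose __hash__ and __eq__ decide membership in both a set and a dict. It uses Python 3 syntax.

```python
class Point:
    """Hypothetical example class; hash and equality are based on (x, y)."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __hash__(self):
        return hash((self.x, self.y))

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


# Both containers call hash() and == on the keys, so the duplicate collapses in each.
as_set = {Point(0, 2), Point(0, 2), Point(1, 0)}
as_dict = {Point(0, 2): None, Point(0, 2): None, Point(1, 0): None}
print(len(as_set), len(as_dict))
```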

This video can help you understand it better.

If you use integers, they come out in order (in the "human" sense of sorted):

>>> s=set()
>>> for e in range(10):
...    s.add(e)
...    print s
... 
set([0])
set([0, 1])
set([0, 1, 2])
set([0, 1, 2, 3])
set([0, 1, 2, 3, 4])
set([0, 1, 2, 3, 4, 5])
set([0, 1, 2, 3, 4, 5, 6])
set([0, 1, 2, 3, 4, 5, 6, 7])
set([0, 1, 2, 3, 4, 5, 6, 7, 8])
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
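One likely reason the integers look sorted: in CPython a small non-negative int is its own hash value, so consecutive ints land in consecutive slots. The sketch below (Python 3 syntax) checks this. The 8-slot table size is an assumption for illustration, and none of this is a language guarantee.

```python
# In CPython, a small non-negative int is its own hash value.
int_hashes = [hash(i) for i in range(10)]

# Assumption for illustration: a table of 8 slots, indexed by the low 3 bits.
TABLE_SIZE = 8
first_slots = [hash(i) & (TABLE_SIZE - 1) for i in range(8)]

print(int_hashes)
print(first_slots)
```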

But if you use tuples, they do not look "ordered" to the human eye:

>>> s=set()
>>> for t in ((i,i*i) for i in range(10)):
...    s.add(t)
...    print s
... 
set([(0, 0)])
set([(0, 0), (1, 1)])
set([(0, 0), (1, 1), (2, 4)])
set([(3, 9), (0, 0), (1, 1), (2, 4)])
set([(3, 9), (0, 0), (1, 1), (4, 16), (2, 4)])
set([(0, 0), (4, 16), (5, 25), (3, 9), (2, 4), (1, 1)])
set([(6, 36), (0, 0), (4, 16), (5, 25), (3, 9), (2, 4), (1, 1)])
set([(6, 36), (0, 0), (7, 49), (4, 16), (5, 25), (3, 9), (2, 4), (1, 1)])
set([(6, 36), (0, 0), (7, 49), (4, 16), (5, 25), (3, 9), (2, 4), (1, 1), (8, 64)])
set([(6, 36), (0, 0), (7, 49), (4, 16), (5, 25), (3, 9), (9, 81), (2, 4), (1, 1), (8, 64)])
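To see why the tuples look scrambled, one can print the low bits of each tuple's hash, which is where a lookup would start in a small table. This is only a sketch in Python 3 syntax: the 8-slot mask is an assumption, and the actual hash values differ between Python versions, so no output is shown.

```python
pairs = [(i, i * i) for i in range(10)]

# Low 3 bits of each tuple's hash: the slot it would try first in an
# 8-slot table (illustrative assumption, not the real table size at all times).
low_bits = [hash(p) & 7 for p in pairs]

for p, slot in zip(pairs, low_bits):
    print(p, slot)
```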

Now try these two lines in the interpreter:

>>> dict.fromkeys(range(10),None)
{0: None, 1: None, 2: None, 3: None, 4: None, 5: None, 6: None, 7: None, 8: None, 9: None}
>>> dict.fromkeys(((i,i*i) for i in range(10)),None)
{(6, 36): None, (0, 0): None, (7, 49): None, (4, 16): None, (5, 25): None, (3, 9): None, (9, 81): None, (2, 4): None, (1, 1): None, (8, 64): None}

You can see that the resulting dicts have the same order as the set examples.

Although dicts and sets with ONLY int keys may look "ordered", from a practical standpoint dicts and sets have no order.

If you watch the linked video, you will understand why.
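When order does matter, it is safer to request it explicitly. The sketch below assumes Python 3.7 or later, where dicts (but not sets) are guaranteed to preserve insertion order. That guarantee does not apply to the Python 2 interpreter used in the transcripts above.

```python
items = [(3, 0), (0, 2), (3, 0), (1, 6), (0, 2)]

# dict keys are unique and, in Python 3.7+, keep their insertion order.
unique_in_order = list(dict.fromkeys(items))

# sorted() gives an order defined by the values themselves.
unique_sorted = sorted(set(items))

print(unique_in_order)
print(unique_sorted)
```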

Answer 4

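Each element goes to a specific slot in the hash table according to its hash value. With the initial 8-slot table, the slot is chosen from the last 3 bits of the hash, so elements whose hashes share those 3 bits collide and some other slot is chosen for the later one. The hash table is expanded once it becomes 2/3 full, to keep the collision rate down. See this video on hash tables.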

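>>> def bits(n):
...     # Assumes a 32-bit hash: adding 2**32 gives the two's-complement bit
...     # pattern for negative values and zero-pads non-negative ones.
...     n += 2**32
...     return bin(n)[-32:]
...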
>>> bits(hash('a'))
'11100100000011011011000111100000' #last three bits are picked to determine the spot in hash table
>>> bits(hash('b'))
'11101011101011101101001101100011'
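The behaviour described above can be imitated with a toy model. ToyHashSet below is a hypothetical class written for this illustration in Python 3 syntax. It uses linear probing, which is a simplification, since CPython's real probe sequence is more elaborate. It also cannot store None, because None marks an empty slot.

```python
class ToyHashSet:
    """Toy open-addressing set; a simplified model, not CPython's implementation."""

    def __init__(self, size=8):
        self.slots = [None] * size   # None marks an empty slot
        self.used = 0

    def _find_slot(self, slots, item):
        mask = len(slots) - 1        # size is a power of two, so the mask keeps the low bits
        i = hash(item) & mask
        while slots[i] is not None and slots[i] != item:
            i = (i + 1) & mask       # collision: try the next slot
        return i

    def add(self, item):
        i = self._find_slot(self.slots, item)
        if self.slots[i] is None:
            self.slots[i] = item
            self.used += 1
            if self.used * 3 >= len(self.slots) * 2:   # at least 2/3 full
                self._grow()

    def _grow(self):
        # Double the table and re-insert every element at its new slot.
        old_items = [x for x in self.slots if x is not None]
        new_slots = [None] * (len(self.slots) * 2)
        for item in old_items:
            new_slots[self._find_slot(new_slots, item)] = item
        self.slots = new_slots

    def __iter__(self):
        # Iteration walks the slots in table order, not insertion order.
        return (x for x in self.slots if x is not None)


s = ToyHashSet()
for n in (3, 8, 16, 1):
    s.add(n)
before_resize = list(s)

for n in (5, 6):
    s.add(n)                 # the sixth element makes the 8-slot table 2/3 full
after_resize = list(s)

print(before_resize)
print(after_resize)
```

With CPython's integer hashes this prints [8, 16, 1, 3] and then [16, 1, 3, 5, 6, 8]: 16 collides with 8 in slot 0 and is pushed to the next slot, and after the resize the existing elements move, much like the reordering seen in the question.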

Answer 5

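As the other answers say:

Sets should be treated as if they have no order
The actual mechanism for adding things to a set is the same as for a dictionary; a set is essentially a dictionary with keys and no values
Dictionaries are based on hash tables (for details: What is the true difference between a dictionary and a hash table?)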

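I thought it might be useful to see what happens behind the scenes, so I have commented the actual source code of the add() method (this is the pure-Python version from the old sets module; the built-in set is implemented in C).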

def add(self, element):
    """Add an element to a set.

    This has no effect if the element is already present.
    """
    try:
        # self._data is the dictionary where all of the set elements are stored
        self._data[element] = True
    except TypeError:
        # Reached when element cannot be a dict key (that is, it isn't hashable).
        # getattr() returns element.__as_immutable__, or None if it has no such attribute.
        transform = getattr(element, "__as_immutable__", None)
        if transform is None:
            # The element has no defined way to make itself immutable (and thus
            # hashable), so re-raise the TypeError exception we caught.
            raise
        # transform() is element.__as_immutable__(), which we now know exists;
        # add its immutable result to the self._data dict.
        self._data[transform()] = True

As you can see, adding an element to a set is the same as storing it as a key in a dict, except that the set tries harder to make element hashable.
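For the built-in set in Python 3, the same hashability requirement can be observed directly. The sketch below is only an illustration: it shows a list being rejected with a TypeError, and a frozenset or tuple being accepted as an immutable stand-in.

```python
s = set()

try:
    s.add([1, 2])            # lists are mutable, hence unhashable
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# A frozenset is the hashable, immutable counterpart of a set.
s.add(frozenset([1, 2]))
# A tuple of hashable items is hashable too.
s.add((1, 2))

print(error_message)
print(len(s))
```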

Python - Where is an element added when using set.add()?

5 answers: