Question

I have a list, my_list (the list contains UTF-8 strings):

>>> len(my_list)
8777
>>> getsizeof(my_list)  # <-- note the size
77848

For some reason, the sorted list (my_sorted_list = sorted(my_list)) uses more memory:

>>> len(my_sorted_list)
8777
>>> getsizeof(my_sorted_list)  # <-- note the size
79104

Why does sorted return a list that takes up more space in memory than the initial unsorted list?

Answer 1

As Ignacio points out, this is due to Python allocating a bit more memory than required. This is done so that .append on a list runs in amortized O(1) time.
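A minimal sketch of that over-allocation, assuming CPython. The names `items`, `sizes` and `resizes`, and the count of 64 appends, are illustrative choices and not part of the original answer. It records `getsizeof` after every append; the reported size should change only now and then, because each resize reserves room for several future appends.

```python
from sys import getsizeof

items = []
sizes = []  # size in bytes reported after each append
for i in range(64):
    items.append(i)
    sizes.append(getsizeof(items))

# Count how many appends actually changed the reported size.
resizes = sum(1 for a, b in zip(sizes, sizes[1:]) if a != b)
print(f"{len(sizes)} appends, {resizes} size changes after the first")
```

The exact byte counts and the number of resizes depend on the interpreter version and platform, so they are printed rather than hard-coded.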

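sorted creates a new list out of the sequence provided, sorts it in place and returns it. To create the new list, Python extends an empty sized list with the one passed; that results in the observed over-allocation (which happens after calling list_resize). You can confirm that sorting isn't the culprit by using list.sort: the same algorithm is used, but no new list is created (in other words, the sort is performed in place). The sizes there, of course, don't differ.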
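The check suggested above can be sketched as follows, assuming CPython. The variable names and the 1000-element input are hypothetical. Sorting in place should leave the reported size untouched, while `sorted` builds a new list whose size may or may not match.

```python
from sys import getsizeof

# Descending input, so the sort has real work to do.
data = [i for i in range(1000, 0, -1)]

size_before = getsizeof(data)
data.sort()                          # in place: no new list is built
size_after = getsizeof(data)

copy_size = getsizeof(sorted(data))  # a new list is created first

print(size_before, size_after, copy_size)
```

Only the first two numbers are expected to agree on every CPython version; how the third compares depends on how that version sizes the new list.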

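It is worth noting that this difference is mostly present when:

The original list was created with a list-comp (where, if space is available and the final append doesn't trigger a resize, the size is smaller).
List literals are used. There a PyList_New is created based on the number of values on the stack and no appends are made. (Direct assigning to the underlying array is performed), which doesn't trigger any resizes and keeps the size at its minimum.

So, with a list-comp: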

from sys import getsizeof

l = [i for i in range(10)]

getsizeof(l)          # 192
getsizeof(sorted(l))  # 200

Or a list literal:

l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

getsizeof(l)          # 144
getsizeof(sorted(l))  # 200

the sizes are smaller (more so when literals are used).

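When creating through list, the memory is always over-allocated; Python knows the sizes and preempts future modifications by over-allocating a bit based on the size: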

l = list(range(10))

getsizeof(l)          # 200
getsizeof(sorted(l))  # 200

So you observe no difference in the sizes of the lists.
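To repeat the three comparisons on your own interpreter, something like the following sketch may help. The helper `report` and the variable names are hypothetical, and the byte counts quoted in the snippets above come from the answerer's build, so expect different numbers on other CPython versions.

```python
from sys import getsizeof


def report(label, lst):
    """Print the size of lst next to the size of sorted(lst)."""
    print(f"{label:<14} {getsizeof(lst):>5} {getsizeof(sorted(lst)):>5}")


comp = [i for i in range(10)]             # built by repeated appends
literal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # built from a list literal
called = list(range(10))                  # built by the list constructor

report("comprehension", comp)
report("literal", literal)
report("list(range)", called)
```

All three lists hold the same ten values; any difference in the printed sizes comes purely from how each construction path allocated the underlying array.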

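As a final note, I must point out that this behavior is specific to the C implementation of Python, i.e. CPython. It's a detail of how the language was implemented and, as such, you shouldn't depend on it in any wacky way.

Jython, IronPython, PyPy and any other implementation may or may not have the same behavior.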

Answer 2

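The list resize operation over-allocates in order to amortize the cost of appending to the list, as opposed to starting with a list preallocated by the compiler.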
