Question

I'm having trouble managing the memory consumption of str.split(). It appears to consume several times more memory than I expected.

Compare the following scripts:

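import memory_profiler
# Build 50 million word lists by splitting the same sentence each time
t = [str.split("one two three four five") for _ in range(50000000)]
# Memory usage of the current process, in MiB
memory_profiler.memory_usage()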

[21748.140625]

A baseline that only stores such lists:

import memory_profiler
# Build 50 million word lists from string literals
t = [["one", "two", "three", "four", "five"] for _ in range(50000000)]
# Memory usage of the current process, in MiB
memory_profiler.memory_usage()

[5468.6015625]
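
A small check that might help explain the gap, assuming CPython: count how many distinct string objects each variant keeps alive. The names N, split_lists and literal_lists are made up for this sketch, and N is kept small so it runs quickly. My guess is that the literal version reuses the same five string constants in every inner list, while str.split() builds five new string objects per call.

```python
N = 1000  # hypothetical small size, only for illustration

# Variant 1: every call to str.split() returns newly created strings
split_lists = [str.split("one two three four five") for _ in range(N)]

# Variant 2: every inner list is new, but its items are the same literal constants
literal_lists = [["one", "two", "three", "four", "five"] for _ in range(N)]

# Count distinct string objects by identity (all objects are still alive here,
# so id() values are unique per object)
distinct_split = len({id(s) for row in split_lists for s in row})
distinct_literal = len({id(s) for row in literal_lists for s in row})

print(distinct_split)    # on CPython I would expect 5 * N
print(distinct_literal)  # on CPython I would expect 5
```

If that holds, the baseline measures mostly the lists themselves, not 250 million separate strings, so it may not be a fair comparison.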

My Python version is:

3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]

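Please note: I need the data in memory for later processing (it has to be in memory so that I can use multiple workers), so using a generator is not a solution. The whole dataset fits in memory at least 3 times over, but I need it split into lists of words.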