Question

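When I call sys.getsizeof(4), it returns 14. Assuming this is the same as sizeof() in C, that is unacceptably high.

I would like to use the in-memory array as one big array of raw bytes. Because of the size of the arrays in the project, memory overhead is of the utmost importance. Portability is also a huge concern, so dropping into C or using a more exotic library is less than ideal.

Is there a way to force Python to use less memory for a single positive signed byte list or tuple member, using only standard Python 3?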

Answer 1

14 strikes me as rather low, considering that a Python object must have at least a pointer to its type struct and a refcount.

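PyObject

All object types are extensions of this type. This is a type which contains the information Python needs to treat a pointer to an object as an object. In a normal "release" build, it contains only the object's reference count and a pointer to the corresponding type object. Nothing is actually declared to be a PyObject, but every pointer to a Python object can be cast to a PyObject *. Access to the members must be done by using the macros Py_REFCNT and Py_TYPE.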

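You will pay this overhead for every Python object. The only way to reduce the overhead/payload ratio is to have more payload, as in arrays (both plain Python and numpy), for example.

The trick here is that array elements typically are not Python objects, so they can dispense with the refcount and the type pointer and occupy just as much memory as the underlying C type.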
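A rough way to see this overhead/payload trade-off for yourself is sketched below. The exact byte counts depend on the interpreter build and platform, so none are quoted here, and the variable names are purely illustrative.

```python
import array
import sys

values = [n % 128 for n in range(10_000)]

# A list stores one pointer per element; each element it points to is a
# full Python object carrying its own refcount and type pointer.
list_container = sys.getsizeof(values)
per_int_object = sys.getsizeof(values[0])

# An array stores the raw C values back to back: one byte each for 'b'.
packed = array.array('b', values)
array_total = sys.getsizeof(packed)

print("list container bytes:   ", list_container)
print("one int object bytes:   ", per_int_object)
print("array('b') total bytes: ", array_total)
print("array bytes per element:", packed.itemsize)
```

Note that sys.getsizeof on a list counts only the container, not the objects it refers to, so the list's true footprint is at least the container size.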

Answer 2

(Hat tip to martineau for his comment...)

If you're only concerned with unsigned bytes (values in [0, 255]), then the simplest answer is probably the built-in bytearray and its immutable sibling, bytes. One potential problem is that these are meant to represent encoded strings (read from or written to the outside world), so their default __repr__ is "string-like" rather than a list of integers:

    >>> lst = [0x10, 0x20, 0x30, 0x41, 0x61, 0x7f, 0x80, 0xff]
    >>> bytearray(lst)
    bytearray(b'\x10 0Aa\x7f\x80\xff')
    >>> bytes(lst)
    b'\x10 0Aa\x7f\x80\xff'

Note that the space, '0', 'A', and 'a' are displayed literally, while "unprintable" values are displayed as '\x##' string escape sequences. If you're trying to think of those bytes as a bunch of integers, this is not what you want.
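Only the display is string-like, though. As a small sketch (the names buf, first and as_ints are just for illustration), indexing and iterating a bytearray still hand back plain integers:

```python
lst = [0x10, 0x20, 0x30, 0x41, 0x61, 0x7f, 0x80, 0xff]
buf = bytearray(lst)

# Indexing yields an int, not a one-character string.
first = buf[0]

# Iterating (or calling list()) recovers the integer values.
as_ints = list(buf)

# bytearray is mutable, and assignments are range-checked to 0..255.
buf[1] = 0x21
error = None
try:
    buf[2] = 256
except ValueError as exc:
    error = str(exc)

print(first, as_ints, list(buf), error)
```

So if the string-like repr is the only objection, printing list(buf) where needed may be enough.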

For homogeneous arrays of fixed-width integers or floats (much like in C), use the standard library's array module.

    >>> import array
    >>> # One megabyte of unsigned 8-bit integers.
    >>> a = array.array('B', (n % 2**8 for n in range(2**20)))
    >>> len(a)
    1048576
    >>> a.typecode
    'B'
    >>> a.itemsize
    1
    >>> a.buffer_info()  # Memory address, number of elements.
    (24936384, 1048576)
    >>> a_slice = a[slice(1024, 1040)]  # Can be sliced like a list.
    >>> a_slice
    array('B', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
    >>> type(a_slice)  # Slice is also an array, not a list.
    <class 'array.array'>

For more complex data, the struct module is for packing heterogeneous records, much like C's struct keyword. Unlike C, I don't see any obvious way to create an array of structs.
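One possible workaround for the missing array of structs, offered as a sketch rather than an established idiom, is to pack fixed-size records back to back into a single bytearray. The record layout ("<Bhf") and all names here are made up for the example.

```python
import struct

# Hypothetical record layout: unsigned byte, signed 16-bit int, 32-bit float,
# little-endian with no padding ("<").
record = struct.Struct("<Bhf")
count = 4

# One flat, preallocated buffer holding `count` records back to back.
storage = bytearray(record.size * count)

for i in range(count):
    record.pack_into(storage, i * record.size, i, -i * 100, i * 0.5)

# Read a single record by its byte offset...
third = record.unpack_from(storage, 2 * record.size)

# ...or walk all of them in order.
all_records = list(record.iter_unpack(storage))

print(record.size, third, all_records)
```

The cost is that every access goes through pack/unpack calls, so this trades speed and convenience for a compact layout.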

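All of these data structures make use of Python's Buffer Protocol, which (at least in CPython) allows a Python class to expose its internal C-like array directly to other Python code. If you need to do something complicated, you may have to learn this... or give up and use NumPy.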
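From pure Python, the usual entry point to the buffer protocol is the built-in memoryview. The following is a minimal sketch, assuming a small array of unsigned bytes; the names view, window and snapshot are illustrative.

```python
import array

a = array.array('B', range(256))

# memoryview exposes the array's buffer without copying it.
view = memoryview(a)
window = view[16:32]  # slicing a memoryview does not copy either

# Writing through the view changes the underlying array.
window[0] = 255

# Converting to bytes does make a copy of just that window.
snapshot = bytes(window)

print(a[16], window.nbytes, view.format, snapshot[:4])

# Release the views before resizing the array; while a view is still
# exported, resizing operations such as append raise BufferError.
window.release()
view.release()
a.append(1)
```

This is handy for passing a slice of a large buffer around without duplicating it.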

How can I limit the amount of memory used to store integers?
