Logging data in a long-running Python simulation

Date: 2019-01-30 16:13:40

Tags: python performance numpy storage simulation

I am running a simulation and need to record some small numpy arrays from every cycle. My current solution is to load, write, and then save, as follows:

import numpy as np

# Re-load the full record, append the new array along the third axis, write everything back
existing_data = np.load("existing_record.npy")
updated = np.dstack((existing_data, new_array[..., None]))
np.save("existing_record.npy", updated)

This creates a big performance bottleneck: the simulation runs at only half speed with this method. I have considered appending the numpy arrays to a list and writing it out at the end of the simulation, but that could obviously exhaust RAM, lose the data in a crash, and so on. Is there any standard kind of solution for this type of problem?
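For reference, the buffered approach mentioned above would look roughly like this (a sketch; run_cycle and num_cycles are placeholders for the simulation's own code):

import numpy as np

buffer = []
for cycle in range(num_cycles):
    new_array = run_cycle()   # placeholder: one small array per simulation cycle
    buffer.append(new_array)  # cheap per cycle, but memory use grows with the run
# a single write at the very end; everything is lost if the run crashes before this
np.save("existing_record.npy", np.dstack(buffer))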

2 Answers:

Answer 0 (score: 0)

I have found a solution that works well, using the h5py library. Performance is much better since no data has to be read back in, and I have also reduced the number of numpy array append operations. A short example:

with h5py.File("logfile_name", "a") as f:
    # Pre-allocate 100000 cycles; maxshape=(3, 2, None) allows resizing later if needed
    ds = f.create_dataset("weights", shape=(3, 2, 100000), maxshape=(3, 2, None))
    ds[:, :, cycle_num] = weight_matrix

I am not sure whether numpy-style slicing copies the matrix, but there is a write_direct(source, source_sel=None, dest_sel=None) function that avoids this, which could be useful for larger matrices.
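For completeness, a minimal sketch of how the per-cycle logging could combine a resizable dataset with write_direct (num_cycles and the (3, 2) per-cycle array are assumptions, not part of the original simulation):

import h5py
import numpy as np

num_cycles = 1000  # assumed stand-in for the simulation's cycle count

with h5py.File("logfile_name", "a") as f:
    # Start with length 0 along the cycle axis; maxshape=(3, 2, None) allows unlimited growth.
    # Note: create_dataset raises if "weights" already exists, so this assumes a fresh file.
    ds = f.create_dataset("weights", shape=(3, 2, 0), maxshape=(3, 2, None), dtype="f8")
    for cycle_num in range(num_cycles):
        weight_matrix = np.random.rand(3, 2)  # stand-in for one cycle's result
        ds.resize(cycle_num + 1, axis=2)
        # write_direct writes the source array into the selection without an intermediate copy
        ds.write_direct(weight_matrix, dest_sel=np.s_[:, :, cycle_num])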

Answer 1 (score: 0)

I think one solution is to use a memory-mapped file via numpy.memmap. The code is below; the documentation contains information that is important for understanding it.

import numpy as np
from os.path import getsize

from time import time

filename = "data.bin"

# Datatype used for memmap
dtype = np.int32

# Create the memmap for the first time (mode='w+'). The shape here is arbitrary;
# ideally, estimate the final size up front.
mm = np.memmap(filename, dtype=dtype, mode='w+', shape=(1, ))
print("File has {} bytes".format(getsize(filename)))


N = 20
num_data_per_loop = 10**7

# Main loop to append data
for i in range(N):

    # Re-opening at a growing offset extends the file; mode='r+' keeps the existing data
    starttime = time()
    mm = np.memmap(filename,
                   dtype=dtype,
                   mode='r+',
                   offset=np.dtype(dtype).itemsize*num_data_per_loop*i,
                   shape=(num_data_per_loop, ))
    mm[:] = np.arange(start=num_data_per_loop*i, stop=num_data_per_loop*(i+1))
    mm.flush()
    endtime = time()
    print("{:3d}/{:3d} ({:6.4f} sec): File has {} bytes".format(i, N, endtime-starttime, getsize(filename)))

# Read the whole file back and verify its contents
A = np.array(np.memmap(filename, dtype=dtype, mode='r'))
if np.array_equal(A, np.arange(num_data_per_loop*N, dtype=dtype)):
    print("Correct")

The output I get is:

File has 4 bytes
  0/ 20 (0.2167 sec): File has 40000000 bytes
  1/ 20 (0.2200 sec): File has 80000000 bytes
  2/ 20 (0.2131 sec): File has 120000000 bytes
  3/ 20 (0.2180 sec): File has 160000000 bytes
  4/ 20 (0.2215 sec): File has 200000000 bytes
  5/ 20 (0.2141 sec): File has 240000000 bytes
  6/ 20 (0.2187 sec): File has 280000000 bytes
  7/ 20 (0.2138 sec): File has 320000000 bytes
  8/ 20 (0.2137 sec): File has 360000000 bytes
  9/ 20 (0.2227 sec): File has 400000000 bytes
 10/ 20 (0.2168 sec): File has 440000000 bytes
 11/ 20 (0.2141 sec): File has 480000000 bytes
 12/ 20 (0.2150 sec): File has 520000000 bytes
 13/ 20 (0.2144 sec): File has 560000000 bytes
 14/ 20 (0.2190 sec): File has 600000000 bytes
 15/ 20 (0.2186 sec): File has 640000000 bytes
 16/ 20 (0.2210 sec): File has 680000000 bytes
 17/ 20 (0.2146 sec): File has 720000000 bytes
 18/ 20 (0.2178 sec): File has 760000000 bytes
 19/ 20 (0.2182 sec): File has 800000000 bytes
Correct

Thanks to the offsets used for the memmap, the time per iteration is roughly constant. Also, the amount of RAM required is constant (apart from loading the whole memmap for the final check).
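The same offset trick also works for reading back a single chunk later without mapping more than necessary, e.g. (a sketch reusing the variables from the code above):

# Map only the i-th chunk of the file; nothing outside this window is touched
i = 3  # example chunk index
chunk = np.memmap(filename,
                  dtype=dtype,
                  mode='r',
                  offset=np.dtype(dtype).itemsize * num_data_per_loop * i,
                  shape=(num_data_per_loop, ))
print(chunk[:5])  # first few values of chunk 3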

I hope this solves your performance problem.

Kind regards,

Lukas

编辑1:张贴者似乎已经解决了自己的问题。我将这个答案留作选择。