Question

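I wrote a for loop in which the duration of each iteration strangely increases, even though the amount of data being manipulated stays constant. The code is below, where: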

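X [N * F]: a numpy array with N samples, each containing F variables (features);
parts [N]: a numpy array containing the participant number of each sample in X;
model_filename: the filename template for each participant's model (i.e. I have one model per participant)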

My goal is to apply participant p's model to participant p's data and save its output (i.e. N outputs in total).

import numpy as np
from keras.models import load_model

outputs = np.full((X.shape[0],), np.nan)
for curr_part in np.unique(parts):
    print("processing participant {0}".format(curr_part))
    model = load_model(model_filename.format(curr_part))  # I measured the duration of this call (d0)
    idx = (parts == curr_part)  # boolean mask selecting this participant's samples
    outputs[idx] = np.squeeze(model.predict(X[idx, :]))  # I measured the duration of this call (d1)
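
The per-call durations d0 and d1 mentioned in the comments could be collected along these lines. This is a minimal sketch, not the original measurement code: `time_loop`, `FakeModel` and `fake_load_model` are hypothetical names, and the fake model only exists so the snippet runs without Keras. With real models, the loader passed in would wrap `load_model(model_filename.format(p))`.

```python
from timeit import default_timer as timer

import numpy as np


def time_loop(X, parts, load_fn):
    """Run the per-participant loop, recording load (d0) and predict (d1) times."""
    outputs = np.full((X.shape[0],), np.nan)
    timings = []  # one (participant, d0, d1) tuple per iteration
    for curr_part in np.unique(parts):
        t0 = timer()
        model = load_fn(curr_part)
        t1 = timer()
        idx = (parts == curr_part)
        outputs[idx] = np.squeeze(model.predict(X[idx, :]))
        t2 = timer()
        timings.append((curr_part, t1 - t0, t2 - t1))
    return outputs, timings


class FakeModel(object):
    """Hypothetical stand-in for a Keras model: predicts the row sums."""

    def predict(self, data):
        return data.sum(axis=1, keepdims=True)


def fake_load_model(participant):
    # Hypothetical loader; a real one would call
    # load_model(model_filename.format(participant)).
    return FakeModel()


X = np.arange(12, dtype=float).reshape(6, 2)
parts = np.array([0, 0, 1, 1, 2, 2])
outputs, timings = time_loop(X, parts, fake_load_model)
for part, d0, d1 in timings:
    print("participant {0}: load {1:.6f}s, predict {2:.6f}s".format(part, d0, d1))
```

Keeping the timings in a list makes it easy to see whether d0, d1, or both grow from one iteration to the next.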

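Both d0 and d1 increase at every iteration of the loop (a full iteration takes 1.5 seconds at iteration 0 and about 8 seconds at iteration 20). I completely fail to understand why. Also interesting: if I run the code several times in IPython, the durations keep accumulating as long as I do not restart the kernel (i.e. on the second run, iteration 0 already takes about 8 seconds). Of course I want to run the code multiple times, so this issue is critical in the long run.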
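
One thing that may be worth ruling out is state that accumulates across loaded models and survives until the kernel restarts. With Keras on the TensorFlow 1.x backend, models are built in a shared global graph, and `keras.backend.clear_session()` resets it; whether that explains the slowdown here is an assumption, not something confirmed in this post. The sketch below shows only the loop structure with a cleanup hook called after each participant. `run_with_cleanup`, `CountingModel`, `clear_registry` and the `registry` list are hypothetical stand-ins so that the snippet runs without Keras.

```python
import numpy as np

# Hypothetical stand-in for global state that grows with every loaded model.
registry = []


class CountingModel(object):
    """Hypothetical model that registers itself globally; predicts row means."""

    def __init__(self):
        registry.append(self)

    def predict(self, data):
        return data.mean(axis=1, keepdims=True)


def clear_registry():
    # Hypothetical cleanup; with real Keras models the analogous call
    # would be keras.backend.clear_session (an assumption, see above).
    del registry[:]


def run_with_cleanup(X, parts, load_fn, cleanup_fn=None):
    """Per-participant loop that optionally runs a cleanup hook each iteration."""
    outputs = np.full((X.shape[0],), np.nan)
    state_sizes = []  # size of the global state after each iteration
    for curr_part in np.unique(parts):
        model = load_fn(curr_part)
        idx = (parts == curr_part)
        outputs[idx] = np.squeeze(model.predict(X[idx, :]))
        del model  # drop the local reference before cleaning up
        if cleanup_fn is not None:
            cleanup_fn()
        state_sizes.append(len(registry))
    return outputs, state_sizes


X = np.arange(12, dtype=float).reshape(6, 2)
parts = np.array([0, 0, 1, 1, 2, 2])

# Without cleanup, the global state grows by one entry per iteration.
_, growing = run_with_cleanup(X, parts, lambda p: CountingModel())
clear_registry()
# With cleanup, the global state is empty after every iteration.
outputs, flat = run_with_cleanup(X, parts, lambda p: CountingModel(), clear_registry)
print(growing, flat)
```

If the per-iteration timings stay flat once such a cleanup call is added to the real loop, accumulated backend state is a likely culprit; if they still grow, the cause lies elsewhere.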

I also tried the following code, which takes approximately the same total duration, although I cannot measure the time of each call:

unik_parts = np.unique(parts)
models = [(p, load_model(model_filename.format(p))) for p in unik_parts]
outputs = [np.squeeze(m.predict(X[parts == p, :])) for p, m in models]

Python version 2.7

The models are Keras models.

Python loop takes more time at each iteration
