Question

In the code below, I am trying to compute the frequency and the sum of a set of vectors (numpy vectors):

def calculate_means_on(the_labels, the_data):
    freq = dict();
    sums = dict();
    means = dict();
    total = 0;
    for index, a_label in enumerate(the_labels):
        this_data = the_data[index];
        if a_label not in freq:
            freq[a_label] = 1;
            sums[a_label] = this_data;
        else:
            freq[a_label] += 1;
            sums[a_label] += this_data;

Suppose the_data (a numpy 'matrix') is initially:

[[ 1.  2.  4.]
 [ 1.  2.  4.]
 [ 2.  1.  1.]
 [ 2.  1.  1.]
 [ 1.  1.  1.]]

After running the code above, the_data becomes:

[[  3.   6.  12.]
 [  1.   2.   4.]
 [  7.   4.   4.]
 [  2.   1.   1.]
 [  1.   1.   1.]]

Why is this? I have narrowed it down to the line sums[a_label] += this_data; because when I change it to sums[a_label] = sums[a_label] + this_data; it behaves as expected, i.e. the_data is not modified.

Answer 1

This line:

this_data = the_data[index]

takes a view of a row of the_data, not a copy. The view is backed by the original array, so mutating the view writes to the original array.
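A minimal sketch of the view-versus-copy distinction. The two-row array and the names row_view and row_copy are made up for illustration; np.shares_memory reports whether two arrays share the same underlying buffer.

```python
import numpy as np

# Small illustrative array (not the data from the question).
the_data = np.array([[1.0, 2.0, 4.0],
                     [2.0, 1.0, 1.0]])

row_view = the_data[0]         # basic integer indexing returns a view
row_copy = the_data[0].copy()  # an explicit copy owns its own memory

shares_view = np.shares_memory(row_view, the_data)
shares_copy = np.shares_memory(row_copy, the_data)

row_copy += 10                 # in-place add on the copy: the_data untouched
after_copy = the_data.copy()   # snapshot taken before mutating the view

row_view += 10                 # in-place add on the view: writes into the_data

print(shares_view, shares_copy)
print(after_copy[0], the_data[0])
```

Only the in-place addition on the view changes the_data; the same operation on the copy leaves it alone.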

This line:

sums[a_label] = this_data

inserts that view into the sums dict, and this line:

sums[a_label] += this_data

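mutates the original array through the view, because when an object is mutable, += requests that the operation be performed by mutating the object in place rather than by creating a new object.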
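One possible fix, sketched under some assumptions: store a copy of the row the first time a label is seen, so later += operations mutate the copy instead of the_data. The function name accumulate_by_label and the labels list are hypothetical (the question does not show the labels it used), so this does not try to reproduce the exact output from the question.

```python
import numpy as np

def accumulate_by_label(the_labels, the_data):
    """Count rows and sum them per label without touching the_data."""
    freq = {}
    sums = {}
    for a_label, this_data in zip(the_labels, the_data):
        if a_label not in freq:
            freq[a_label] = 1
            # .copy() gives the dict its own buffer instead of a view
            sums[a_label] = this_data.copy()
        else:
            freq[a_label] += 1
            # in-place add now mutates the copy, not the original row
            sums[a_label] += this_data
    return freq, sums

the_data = np.array([[1.0, 2.0, 4.0],
                     [1.0, 2.0, 4.0],
                     [2.0, 1.0, 1.0],
                     [2.0, 1.0, 1.0],
                     [1.0, 1.0, 1.0]])
original = the_data.copy()
the_labels = ["a", "a", "b", "b", "c"]  # assumed labels, for illustration only

freq, sums = accumulate_by_label(the_labels, the_data)
print(freq)
print(sums)
print(np.array_equal(the_data, original))
```

The workaround mentioned in the question, sums[a_label] = sums[a_label] + this_data, also avoids the problem, because + builds a new array and rebinds the dict entry to it rather than writing into the view.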

+= with a numpy.array object modifies the original object
