Question

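This code is from the book Machine Learning in Action. source code

The argument passed as dataSet is an m * 3 array (loaded from datingTestSet2.txt, which can be found in the parent directory).

My questions are:

What is the advantage of preparing a matrix to return? (Saving memory?)

If I don't prepare the matrix, will anything go wrong? (It seems not.)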

from numpy import *
def autoNorm(dataSet):
    minVals = dataSet.min(0)
    maxVals = dataSet.max(0)
    ranges = maxVals - minVals
    normDataSet = zeros(shape(dataSet)) # prepare matrix to return(It's my own comment, not in the source code. )
    # Because there is a similar code before it, 
    # I think it should be the same meaning. Or any means else?
    m = dataSet.shape[0]
    normDataSet = dataSet - tile(minVals, (m,1))
    normDataSet = normDataSet/tile(ranges, (m,1))   #element wise divide
    return normDataSet, ranges, minVals

Answer 1

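There is no advantage. In the code you've shown, the first assignment to normDataSet has no lasting effect, because normDataSet is assigned a second time a few lines later. At that point, the reference count of the array of zeros previously bound to normDataSet drops to zero, and that old array is garbage collected immediately. (This assumes CPython, of course, but at the time of writing no alternative Python implementation fully supports NumPy.)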
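
The rebinding described above can be observed with a weak reference. This is a minimal sketch, assuming CPython's reference-counting behavior; the names `data`, `norm`, and `zeros_ref` are illustrative and do not come from the book's code.

```python
import weakref

import numpy as np

data = np.arange(12, dtype=float).reshape(4, 3)

# First assignment: the "prepared" array of zeros.
norm = np.zeros(data.shape)
zeros_ref = weakref.ref(norm)  # watch the zeros array without keeping it alive

# Second assignment: the name is rebound to a brand-new array.
# Nothing is written into the zeros array; it simply loses its last reference.
norm = data - data.min(0)

# On CPython the zeros array is freed as soon as its refcount hits zero,
# so the weak reference is already dead here.
zeros_is_gone = zeros_ref() is None
print(zeros_is_gone)
```

Preallocating would only matter if the later statements wrote into the existing array (for example with `norm[:] = ...`) instead of rebinding the name.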

My guess is that this is a simple (but relatively harmless) mistake by the author. I suggest filing a bug report so that it can be fixed.

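By the way, a terminology nit: normDataSet is an array, not a matrix. This matters because NumPy has a matrix type whose multiplication and exponentiation behave differently from those of a regular array.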
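
To illustrate the distinction, here is a small sketch comparing the `*` and `**` operators on a regular array and on `numpy.matrix`. The variable names are made up for the example, and the warning filter is only a precaution, since recent NumPy versions discourage the matrix class.

```python
import warnings

import numpy as np

a = np.array([[1, 2], [3, 4]])

with warnings.catch_warnings():
    # np.matrix may emit a PendingDeprecationWarning on recent NumPy versions.
    warnings.simplefilter("ignore", PendingDeprecationWarning)
    m = np.matrix(a)

    elementwise_product = a * a   # ndarray: multiplies element by element
    matrix_product = m * m        # matrix: linear-algebra product, like a @ a

    elementwise_power = a ** 2    # ndarray: squares each element
    matrix_power = m ** 2         # matrix: same as m * m

print(elementwise_product)
print(matrix_product)
```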

Answer 2

Besides not needing to initialize normDataSet, you don't need it at all. You can simply rebind the name dataSet to point to a new array, without affecting the array that was passed in.

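In general, the code is overly verbose and complicated, and it doesn't take full advantage of numpy. I don't know of a built-in numpy function that rescales an array to the [0,1] range, but it can be done easily with element-wise operations on a numpy array.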
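
A minimal sketch of such an element-wise version follows. It is one possible rewrite, not necessarily what the book's author or this answer intended. The snake_case names are hypothetical, and the function assumes a 2-D numeric array in which no column is constant (a constant column would give a zero range and a division by zero).

```python
import numpy as np


def auto_norm(data_set):
    """Rescale each column of a 2-D array to the [0, 1] range."""
    data_set = np.asarray(data_set, dtype=float)
    min_vals = data_set.min(axis=0)
    ranges = data_set.max(axis=0) - min_vals
    # Broadcasting stretches the per-column vectors across all rows,
    # so neither tile() nor a preallocated result array is needed.
    # The local name is rebound to a new array; the caller's array is untouched.
    data_set = (data_set - min_vals) / ranges
    return data_set, ranges, min_vals


sample = np.array([[1.0, 10.0, 100.0],
                   [2.0, 20.0, 300.0],
                   [3.0, 40.0, 200.0]])
normed, ranges, min_vals = auto_norm(sample)
print(normed)
```

The return values mirror those of the original autoNorm (normalized data, ranges, minimums), so the sketch could be swapped in wherever those three results are used.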

numpy

What is the advantage of preparing a matrix to return in Python?
