A question about locally weighted linear regression

Date: 2019-05-06 06:50:59

Tags: python tensorflow machine-learning

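One problem with linear regression is that it tends to underfit the data, and one way to address this is a technique called locally weighted linear regression. I learned about this technique from the CS229 lecture notes by Andrew Ng, and I tried writing the following script: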

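import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.matrix_inverse)
import matplotlib.pyplot as plt
from numpy import mat, shape, eye, exp

trX = np.linspace(0, 1, 100)
trY = trX + np.random.normal(0, 1, 100)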

sess = tf.Session()
xArr = []
yArr = []
for i in range(len(trX)):
    xArr.append([1.0,float(trX[i])])
    yArr.append(float(trY[i]))

xMat = mat(xArr); 
yMat = mat(yArr).T

A_tensor = tf.constant(xMat)
b_tensor = tf.constant(yMat)

m = shape(xMat)[0]
weights = mat(eye((m)))
k = 1.0
for j in range(m):
    for i in range(m):
        diffMat = xMat[i]- xMat[j,:]
        weights[j,j] = exp(diffMat*diffMat.T/(-2.0*k**2))

weights_tensor = tf.constant(weights)
# Matrix inverse solution
wA = tf.matmul(weights_tensor, A_tensor)
tA_A = tf.matmul(tf.transpose(A_tensor), wA)
tA_A_inv = tf.matrix_inverse(tA_A)
product = tf.matmul(tA_A_inv, tf.transpose(A_tensor))
solution = tf.matmul(product, b_tensor)

solution_eval = sess.run(solution)

# Extract coefficients
slope = solution_eval[0][0]
y_intercept = solution_eval[1][0]

print('slope: ' + str(slope))
print('y_intercept: ' + str(y_intercept))

# Get best fit line

best_fit = []
for i in xArr:
  best_fit.append(slope*i+y_intercept)

# Plot the results
plt.plot(xArr, yArr, 'o', label='Data')
plt.plot(xArr, best_fit, 'r-', label='Best fit line', linewidth=3)
plt.legend(loc='upper left')
plt.show()

When I run the script above, an error occurs: TypeError: 'numpy.float64' object cannot be interpreted as an integer. The error is raised by this statement:

best_fit.append(slope*i+y_intercept)

I have tried to fix this but have not found a solution yet. Please help me.

1 Answer:

Answer 0: (score: 0)

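Inside the loop, i is a list, for example [1.0, 1.0], so slope*i multiplies a NumPy float by a whole list. You need to decide which value from the list to multiply by slope. For example: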

best_fit = []
for i in xArr:
    best_fit.append(slope*i[0]+y_intercept)

The first element of each list appears to always be 1.

...
[1.0, 0.24242424242424243]
[1.0, 0.25252525252525254]
[1.0, 0.26262626262626265]
[1.0, 0.27272727272727276]
[1.0, 0.2828282828282829]
[1.0, 0.29292929292929293]
[1.0, 0.30303030303030304]
[1.0, 0.31313131313131315]
[1.0, 0.32323232323232326]
[1.0, 0.33333333333333337]
[1.0, 0.3434343434343435]
[1.0, 0.3535353535353536]
...

So I think you are probably looking for the second element of each list (the x value, judging by how xArr is built)...

best_fit = []
for i in xArr:
    best_fit.append(slope*i[1]+y_intercept)
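
For reference, here is a minimal NumPy-only sketch of the same fix, assuming the design matrix rows are [1.0, x] as in the question. It is an illustration rather than the asker's code: it uses an unweighted fit and swaps in np.linalg.lstsq for the TensorFlow matrix inverse, and the fixed random seed is an assumption added for reproducibility. Note that with this column order the first coefficient is the intercept and the second is the slope, so the assignment in the question (slope from index 0, y_intercept from index 1) looks swapped.

```python
import numpy as np

# Synthetic data generated as in the question (seed added for this sketch).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)
y = x + rng.normal(0, 1, 100)

# Design matrix with the same layout as xArr: each row is [1.0, x].
X = np.column_stack([np.ones_like(x), x])

# Unweighted least squares; lstsq replaces the explicit matrix inverse.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Column 0 of X is the constant 1, so theta[0] is the intercept
# and theta[1] is the slope.
y_intercept, slope = theta

# Use the x column (index 1) rather than the whole row for the fitted line.
best_fit = slope * X[:, 1] + y_intercept
```

For plotting, the x column should be used as well, for example plt.plot(X[:, 1], y, 'o') and plt.plot(X[:, 1], best_fit, 'r-'), since passing the two-column xArr to plt.plot would draw the constant column too.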