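One problem with linear regression is that it tends to underfit the data, and one way to address this is a technique called locally weighted linear regression. I read about this technique in the CS229 Lecture notes by Andrew Ng, and I tried writing the following script: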
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from numpy import mat, shape, eye, exp

trX = np.linspace(0, 1, 100)
trY = trX + np.random.normal(0, 1, 100)
sess = tf.Session()
xArr = []
yArr = []
for i in range(len(trX)):
    xArr.append([1.0, float(trX[i])])
    yArr.append(float(trY[i]))
xMat = mat(xArr)
yMat = mat(yArr).T
A_tensor = tf.constant(xMat)
b_tensor = tf.constant(yMat)
m = shape(xMat)[0]
weights = mat(eye((m)))
k = 1.0
for j in range(m):
    for i in range(m):
        diffMat = xMat[i] - xMat[j, :]
        weights[j, j] = exp(diffMat*diffMat.T/(-2.0*k**2))
weights_tensor = tf.constant(weights)
# Matrix inverse solution
wA = tf.matmul(weights_tensor, A_tensor)
tA_A = tf.matmul(tf.transpose(A_tensor), wA)
tA_A_inv = tf.matrix_inverse(tA_A)
product = tf.matmul(tA_A_inv, tf.transpose(A_tensor))
solution = tf.matmul(product, b_tensor)
solution_eval = sess.run(solution)
# Extract coefficients
slope = solution_eval[0][0]
y_intercept = solution_eval[1][0]
print('slope: ' + str(slope))
print('y_intercept: ' + str(y_intercept))
# Get best fit line
best_fit = []
for i in xArr:
    best_fit.append(slope*i+y_intercept)
# Plot the results
plt.plot(xArr, yArr, 'o', label='Data')
plt.plot(xArr, best_fit, 'r-', label='Best fit line', linewidth=3)
plt.legend(loc='upper left')
plt.show()
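When I run the script above, it fails with: TypeError: 'numpy.float64' object cannot be interpreted as an integer. The error is raised by this statement:
best_fit.append(slope*i+y_intercept)
I have tried to fix this but have not found a solution. Please help.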
Answer 0 (score: 0)
Inside the loop, i is a list, for example [1.0, 1.0]. You need to decide which element of that list to multiply by slope. For example:
best_fit = []
for i in xArr:
    best_fit.append(slope*i[0]+y_intercept)
The first element of each list always seems to be 1.
...
[1.0, 0.24242424242424243]
[1.0, 0.25252525252525254]
[1.0, 0.26262626262626265]
[1.0, 0.27272727272727276]
[1.0, 0.2828282828282829]
[1.0, 0.29292929292929293]
[1.0, 0.30303030303030304]
[1.0, 0.31313131313131315]
[1.0, 0.32323232323232326]
[1.0, 0.33333333333333337]
[1.0, 0.3434343434343435]
[1.0, 0.3535353535353536]
...
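So I think you are probably after the second element of the list (the x value rather than a weight, since each row is built as [1.0, float(trX[i])]):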
best_fit = []
for i in xArr:
    best_fit.append(slope*i[1]+y_intercept)
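To make the indexing concrete, here is a minimal sketch of the same idea in plain NumPy instead of TensorFlow. It is an illustration under assumptions, not the asker's script: `lwlr_theta` is a hypothetical helper name, the query point 0.5 and the random seed are arbitrary, and the weights are applied on both sides of the normal equations, as in the usual weighted least-squares form. One thing it shows is that when rows are laid out as [1.0, x], the first solved coefficient belongs to the constant column, so it is the intercept, and the second is the slope.

```python
import numpy as np

def lwlr_theta(query_x, X, y, k=1.0):
    """Weighted least squares centred on query_x (hypothetical helper).

    X has rows [1.0, x]; the result is theta = [intercept, slope].
    """
    # Gaussian weight for each training point, based on its distance to the query.
    w = np.exp(-((X[:, 1] - query_x) ** 2) / (2.0 * k ** 2))
    W = np.diag(w)
    # Weighted normal equations: (X^T W X) theta = X^T W y
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

rng = np.random.default_rng(0)          # arbitrary seed, for repeatability only
trX = np.linspace(0, 1, 100)
trY = trX + rng.normal(0, 1, 100)

# Same layout as xArr in the question: one row [1.0, x] per sample.
X = np.column_stack([np.ones_like(trX), trX])
intercept, slope = lwlr_theta(0.5, X, trY, k=1.0)

# Evaluate the line on the x column (index 1), not on the whole row.
best_fit = slope * X[:, 1] + intercept
```

Because `X[:, 1]` is a one-dimensional array of x values, the multiplication is element-wise and no Python list is ever multiplied by a float. The same column can be passed to plt.plot in place of xArr so that only the x values are plotted.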