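I am currently doing a linear regression analysis. The input variable is Size and the output variable is Price. I store the dataset in a 2D array. I know the analysis is easy with NumPy, but my professor told me to use only for loops to perform the iterations. The formula for the iteration is shown in the picture in the hyperlink. So I decided to use the following code to perform the calculation: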
# Structure of the array (stored as floats): each entry is [Room, Price]
array = [[4.0, 399.9], [5.0, 329.9], [6.0, 369.0]]
# Set initial values
theta_price = 0
theta_room = 0
stepsize = 0.01
item = 3
# Perform iterations
for looping in range(0, 50):  # Loop 50 times
    for j in array[0]:  # Call the value stored in array[0]
        for k in array[1]:  # Call the value stored in array[1]
            # Update of theta 0 (sum() is called on a single number here)
            theta_price_1 = theta_price - stepsize * (1 / item) * (sum((theta_price + theta_room * int(j) - int(k))))
            # Update of theta 1
            theta_room_1 = theta_room - stepsize * (1 / item) * (sum((theta_price + theta_room * int(j) - int(k)) * int(j)))
    # Bring the new theta values to the next loop
    theta_price = theta_price_1
    theta_room = theta_room_1
    print(theta_price, theta_room)  # Print the result for every loop
The code above fails at line 10 with the following error message:
'int' object is not iterable
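However, if I remove the sum function, the calculation gives incorrect results. So I know there is some problem with how the sum function and the array are used, but I don't know how to solve it.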
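For reference, the error can be reproduced in isolation. The following is a minimal sketch, not part of the original code; the names `error_message`, `residuals`, and `total` are made up for illustration. It shows that the built-in `sum` expects an iterable, so passing it a single number raises the `TypeError` quoted above, while passing it a list of per-sample values works.

```python
# sum() iterates over its argument, so a bare number is rejected.
try:
    sum(3)  # same mistake as calling sum() on one residual
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Passing a list (one value per sample) is what sum() expects.
residuals = [1.0, -2.0, 0.5]  # hypothetical per-sample values
total = sum(residuals)
print(total)
```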
Answer 0 (score: 0)
As I mentioned in the comments, sum should be applied to all elements in each iteration, which is what Batch Gradient Descent does. So the code should be:
theta_price = 0
theta_room = 0
stepsize = 0.1
item = 5
# Test data: array[0] holds the inputs, array[1] holds the targets
array = [
    [0, 1, 2, 3, 4],
    [5, 6, 7, 8, 9],
]
# Perform iterations
for looping in range(0, 500):  # Loop 500 times
    # Update of theta 0: sum the residuals over all samples
    theta_price = theta_price - stepsize * (1 / item) * (sum([theta_price + theta_room * int(j) - int(k) for j, k in zip(array[0], array[1])]))
    # Update of theta 1 (this uses the theta_price already updated above)
    theta_room = theta_room - stepsize * (1 / item) * (sum([(theta_price + theta_room * int(j) - int(k)) * int(j) for j, k in zip(array[0], array[1])]))
    print(theta_price, theta_room)  # Print the result for every loop
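After 500 iterations on the 5 test data points, I get the result:
4.999999614653767 1.0000001313279816
which is as expected, since the test data satisfy price = 5 + 1 * size.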
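The loop above updates theta_price first and then uses the new value when updating theta_room. Textbook batch gradient descent usually computes both updates from the old values (a simultaneous update). Below is a minimal sketch of that variant using only for loops, as the question requires. The function name `batch_gradient_descent` and its parameter names are assumptions made for illustration, not something from the question or the code above.

```python
def batch_gradient_descent(sizes, prices, stepsize=0.1, iterations=500):
    """Fit price ~ theta0 + theta1 * size with plain for loops."""
    theta0 = 0.0  # intercept (theta_price in the code above)
    theta1 = 0.0  # slope (theta_room in the code above)
    m = len(sizes)
    for _ in range(iterations):
        grad0 = 0.0
        grad1 = 0.0
        # Accumulate the gradient over every sample (the "batch").
        for x, y in zip(sizes, prices):
            error = theta0 + theta1 * x - y
            grad0 += error
            grad1 += error * x
        # Both updates use the old theta values (simultaneous update).
        theta0 = theta0 - stepsize * grad0 / m
        theta1 = theta1 - stepsize * grad1 / m
    return theta0, theta1


# Same test data as above: prices = 5 + 1 * size.
sizes = [0, 1, 2, 3, 4]
prices = [5, 6, 7, 8, 9]
theta0, theta1 = batch_gradient_descent(sizes, prices)
print(theta0, theta1)
```

On this test data both variants should approach an intercept of 5 and a slope of 1. The explicit inner loop replaces the `sum([...])` list comprehension, which may be closer to a "for loops only" requirement.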