Code crashes when the number of iterations > batch size

Time: 2019-11-22 16:53:43

Tags: python-3.x machine-learning linear-regression gradient-descent mini-batch

import time

import numpy as np


class MiniBatch:

    def __init__(self, batch_size):
        self.batch_size = batch_size

    def get_batches(self,X,y,batch_size,i):
        X_new = X[i:i+batch_size,:]
        y_new = y[i:i+batch_size]  
        return X_new, y_new

    def fit(self,X,y,alpha=0.01):
        n_iterations = 404
        n_iterations_ = []
        t_ = []
        gd_number_update_=[]
        self.cost_ = []
        gradient = 0
        λ = 0.01
        theta = np.zeros((X.shape[1], 1))

        t2_start = time.time() 
        for i in range(n_iterations):
            X_batch, y_batch = self.get_batches(X,y,self.batch_size,i)
            y_pred = np.dot(X_batch,theta)
            loss = y_pred - y_batch
            # L2 penalty gradient is λ * theta (element-wise), not the sum of λ * theta
            gradient = np.dot(X_batch.T, loss) + λ * theta
            theta = theta - (alpha/self.batch_size) * gradient
            cost = sum(loss**2) + np.dot(np.sum((theta)**2), λ)  #regularized cost
            cost = cost/(2*self.batch_size)
            self.cost_.append(cost)
            t2_stop = time.time()
            z = (t2_stop-t2_start)
            t_.append(z)
            n_iterations_.append(i)
            if len(X_batch)>0:
                gd_number_update = round(404/len(X_batch))
                gd_number_update_.append(gd_number_update)
        return theta, n_iterations_, self.cost_, t_, gd_number_update_

batch_size = [1,16,128,256,404]

With the code above, when n_iterations is greater than the largest batch size of 404 (for example, when I set n_iterations = 500), the code raises this error:

ValueError: x and y must have same first dimension, but have shapes (404,) and (500,)

I can't figure out what mistake I'm making here. Can you help?
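The shapes in the error can be reproduced without any of the gradient code. The sketch below assumes that X has 404 rows (inferred from the hard-coded 404 in fit and from the error message) and uses placeholder data; the plotting call that raises the ValueError is not shown in the question, so it is left out. It only mirrors the slicing from get_batches and the two list appends from fit.

```python
import numpy as np

n_samples = 404        # assumed row count, inferred from the 404 in fit
n_iterations = 500
batch_size = 16

X = np.zeros((n_samples, 3))   # placeholder data; only the shape matters

n_iterations_ = []
gd_number_update_ = []
for i in range(n_iterations):
    X_batch = X[i:i + batch_size, :]   # same slicing as get_batches
    n_iterations_.append(i)            # appended on every iteration
    if len(X_batch) > 0:               # False once i >= n_samples
        gd_number_update_.append(round(n_samples / len(X_batch)))

print(len(n_iterations_), len(gd_number_update_))   # 500 404
print(X[450:450 + batch_size, :].shape)             # (0, 3): empty, no IndexError
```

NumPy slices that start past the end of an array return an empty array instead of raising, so for i >= 404 the batch is empty, the `if len(X_batch)>0` branch is skipped, and gd_number_update_ stops at 404 entries while the other returned lists keep growing to 500. Plotting gd_number_update_ against any of those lists would then produce the reported shapes (404,) and (500,).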

0 answers:

No answers
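One possible way to make every iteration produce a full batch, regardless of how n_iterations compares to the number of rows, is to wrap the indices around the end of the data with a modulo. This is a sketch and not the asker's method: `get_batch_wrapped` is a hypothetical helper, the data is a placeholder, and unlike the original get_batches (which starts the window at row i) it advances by batch_size rows per iteration.

```python
import numpy as np


def get_batch_wrapped(X, y, batch_size, i):
    """Return the i-th mini-batch, wrapping around the end of the data."""
    n = X.shape[0]
    start = (i * batch_size) % n
    # Row indices past the last row wrap back to the beginning.
    idx = np.arange(start, start + batch_size) % n
    return X[idx], y[idx]


# Placeholder data with 404 rows (assumed size).
X = np.arange(404 * 2, dtype=float).reshape(404, 2)
y = np.arange(404, dtype=float).reshape(404, 1)

sizes = [get_batch_wrapped(X, y, 16, i)[0].shape[0] for i in range(500)]
print(len(sizes), set(sizes))   # 500 {16}
```

With batches that are never empty, the `if len(X_batch)>0` guard in fit always passes, so all returned lists end up with n_iterations entries and can be plotted against each other.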