I can't get my TensorFlow gradient descent linear regression algorithm to work

Asked: 2017-10-04 22:40:03

Tags: machine-learning tensorflow

I'm trying to write a simple TensorFlow linear regression model that takes a subset of the Boston housing data, specifically the number-of-rooms column (RM) as the independent variable and the median home value (MEDV) as the dependent variable, and applies a gradient descent algorithm to it.

However, when I run it, the optimizer doesn't seem to do anything. The cost never decreases, and the weight actually moves in the wrong direction: it increases.

Here are the various plots I've built:

  1. A scatter plot of x and y

  2. A PCA analysis plot

  3. The fit against the original data

  4. The fit against the test data.

The images are here: https://imgur.com/a/yVHC9

My program's output is as follows:

      

    Epoch: 0050 cost= 6393135366144.000000000 W = 110392.0 b = 456112.0
    Epoch: 0100 cost= 6418308005888.000000000 W = 111131.0 b = 459181.0
    Epoch: 0150 cost= 6418496225280.000000000 W = 111136.0 b = 459203.0
    Epoch: 0200 cost= 6418497798144.000000000 W = 111136.0 b = 459203.0
    ...
    Epoch: 1000 cost= 6418497798144.000000000 W = 111136.0 b = 459203.0

Note that the cost never decreases, and in fact the weight increases slightly when it should be decreasing.

I have no idea why this is happening. The data seems reasonable, and as far as I can tell there's no reason the optimizer shouldn't work. The code itself is just a standard TensorFlow linear regression example that I took from the internet and modified for my dataset.

    import pandas as pd
    import matplotlib.pyplot as plt
    from matplotlib.mlab import PCA
    from mpl_toolkits.mplot3d import Axes3D
    import numpy as np
    import tensorflow as tf
    import sys
    from sklearn import model_selection
    from sklearn import preprocessing
    np.set_printoptions(precision=3,suppress=True)
    
    def pca(dataset):
    
        plt.scatter(dataset[:,0],dataset[:,1])
        plt.plot()
        plt.show()
        results = PCA(dataset)
        x = []
        y = []
    
        for item in results.Y:
            x.append(item[0])
            y.append(item[1])
    
        plt.close('all')
        fig1 = plt.figure()
        pltData = [x,y]
        plt.scatter(pltData[0],pltData[1],c='b')
        xAxisLine = ((min(pltData[0]),max(pltData[0])),(0,0),(0,0))
        yAxisLine = ((min(pltData[1]),max(pltData[1])),(0,0),(0,0))
        plt.xlabel('RM')
        plt.ylabel('MEDV')
        plt.show()
    
    
    rng = np.random
    # learning_rate is the alpha value that we pass to the gradient descent algorithm. 
    learning_rate = 0.1
    
    
    # How many cycles we're going to run to try and get our optimum fit. 
    training_epochs = 1000
    display_step =  50
    
    # We're going to pull in the csv file and extract the X value (RM) and Y value (MEDV).
    
    boston_dataset = pd.read_csv('data/housing.csv')
    label = boston_dataset['MEDV']
    features = boston_dataset['RM'].values.reshape(-1,1)
    dataset = np.column_stack((np.asarray(boston_dataset['RM']),np.asarray(boston_dataset['MEDV'])))
    
    pca(dataset)
    
    
    train_X, test_X, train_Y, test_Y = model_selection.train_test_split(features, label, test_size = 0.33, 
                                     random_state = 5)
    
    
    scaler =  preprocessing.StandardScaler()
    train_X = scaler.fit_transform(train_X)
    # This is the total number of data samples that we're going to run through. 
    n_samples = train_X.shape[0]
    
    # Variable placeholders. 
    X = tf.placeholder('float')
    Y = tf.placeholder('float')
    
    W = tf.Variable(rng.randn(), name = 'weight')
    b = tf.Variable(rng.randn(), name = 'bias')
    
    # Here we describe our training model.  It's a linear regression model using the standard y = mx + b 
    # point slope formula. We calculate the cost by using least mean squares.
    
    # This is our prediction algorithm: y = mx + b
    prediction = tf.add(tf.multiply(X,W),b)
    
    # Let's now calculate the cost of the prediction algorithm using least mean squares
    
    training_cost = tf.reduce_sum(tf.pow(prediction-Y,2))/(2 * n_samples)   
    # This is our gradient descent optimizer algorithm.  We're passing in alpha, our learning rate
    # and we want the minimum value of the training cost.  
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(training_cost)
    
    init = tf.global_variables_initializer()
    
    # Now we'll run our training data through our model.
    with tf.Session() as tf_session:
    
        # Initialize all of our tensorflow variables.
        tf_session.run(init)
    
        # We'll run the data through training_epochs (1000) times.
        for epoch in range(training_epochs):
    
            # For each training cycle, pass the x and y values to our optimizer to update W and b.
            for (x,y) in zip(train_X,train_Y):
                tf_session.run(optimizer,feed_dict = {X: x, Y: y})
    
            # Every display_step (50) epochs, check and see how we're doing.
            if (epoch + 1) % display_step == 0:
                c = tf_session.run(training_cost,feed_dict = {X: train_X, Y: train_Y})
                print ('Epoch: ', '%04d' % (epoch+1), 'cost=', '{:.9f}'.format(c),
                       'W = ', tf_session.run(W), 'b = ', tf_session.run(b))
    
        print ('Optimization finished')
        # training_cost is a tensor, so evaluate it in the session to get its value.
        final_training_cost = tf_session.run(training_cost, feed_dict = {X: train_X, Y: train_Y})
        print ('Training cost = ', final_training_cost, ' W = ', tf_session.run(W), ' b = ', tf_session.run(b), '\n')
    
        plt.plot(train_X, train_Y, 'ro', label = 'Original data')
        plt.plot(train_X, tf_session.run(W) * train_X + tf_session.run(b), label = 'Fitted line')
        plt.legend()
        plt.show()
    
        # We're now going to run test data to see how well our trained model works.
        print ('Testing...(mean square loss comparison)')
        testing_cost = tf_session.run(tf.reduce_sum(tf.pow(prediction - Y, 2)) / (2 * test_Y.shape[0]),
                                      feed_dict = {X: test_X, Y: test_Y})
        print ('Testing cost = ', testing_cost)
        print ('Absolute mean square loss difference: ', abs(final_training_cost - testing_cost))
    
        plt.plot(test_X, test_Y, 'bo', label = 'Testing data')
        plt.plot(test_X, tf_session.run(W) * test_X + tf_session.run(b), label = 'Fitted line')
        plt.legend()
        plt.show()
    

I'd really like to know why the optimizer isn't working, so if anyone can point me in the right direction, I'd be very grateful.

Thanks

1 Answer:

Answer 0 (score: 1):

This is probably related to your learning rate. Try reducing it, or updating it after a number of epochs.

For example, if you're using 100 epochs, try setting the learning rate to 0.01, lowering it to 0.001 after 30 epochs, and then dropping it to 0.0001 again after another 30 or 40 epochs.
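Against the TF 1.x code in your question, one way to implement that kind of schedule is to feed the learning rate in through a placeholder and pick its value per epoch. This is only a minimal sketch: the `lr` placeholder and `current_lr` variable are names introduced here, and the epoch cutoffs just mirror the numbers above:

    # Feed the learning rate through a placeholder so it can change per epoch.
    lr = tf.placeholder(tf.float32, shape=[])
    optimizer = tf.train.GradientDescentOptimizer(lr).minimize(training_cost)

    with tf.Session() as tf_session:
        tf_session.run(init)
        for epoch in range(training_epochs):
            # Step the rate down at the suggested epoch boundaries.
            if epoch < 30:
                current_lr = 0.01
            elif epoch < 70:
                current_lr = 0.001
            else:
                current_lr = 0.0001
            for (x, y) in zip(train_X, train_Y):
                tf_session.run(optimizer, feed_dict={X: x, Y: y, lr: current_lr})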

You can look at how common architectures like AlexNet update their learning rates to get an idea...
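TF 1.x also ships built-in schedule helpers if you'd rather not manage the rate by hand. Below is a sketch using `tf.train.exponential_decay`; the `decay_steps` and `decay_rate` values are illustrative assumptions, not something prescribed here:

    # Let TensorFlow decay the rate automatically as global_step increases.
    global_step = tf.Variable(0, trainable=False, name='global_step')
    learning_rate = tf.train.exponential_decay(
        0.01,               # initial learning rate
        global_step,        # incremented once per optimizer step below
        decay_steps=1000,   # apply the decay every 1000 optimizer steps
        decay_rate=0.96,    # multiply the rate by 0.96 each time
        staircase=True)     # drop in discrete steps rather than continuously
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        training_cost, global_step=global_step)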

Good luck.