Linear regression: evaluating training accuracy

Time: 2019-08-25 15:38:55

Tags: python pandas tensorflow matplotlib linear-regression

I am reading data from a CSV file (1000 rows). It has two columns: the first holds the x_training data and the second holds the y_training data.

Snapshot of the CSV file:

8.070000000000000284e+01,1.126768031895251987e+01
8.040000000000000568e+01,1.195844519276935358e+01
7.250000000000000000e+01,8.317461617744008606e+00
1.030000000000000000e+02,1.880844309373589951e+01
1.075999999999999943e+02,1.947419293659330108e+01
7.940000000000000568e+01,9.877652348817933969e+00
8.190000000000000568e+01,1.127064360995226977e+01
1.015999999999999943e+02,1.640426417487080357e+01
1.085999999999999943e+02,1.749193091101176378e+01
9.570000000000000284e+01,1.574942514809519345e+01
5.270000000000000284e+01,3.581285321328901539e+00

After loading the data, I split it into its two columns and convert each one to a column matrix.

import numpy as np
import pandas as pd
import tensorflow as tf
data = pd.read_csv('length_weight.csv', delimiter=",", dtype='float32')
x_train = np.mat(data.iloc[:, 0]).reshape(-1, 1)
y_train = np.mat(data.iloc[:, 1]).reshape(-1, 1)
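Before training, it can help to confirm that every loaded value is finite, since a single missing or infinite entry turns the mean squared error into NaN. The following is a minimal sketch, not part of the original code: `summarize_column` is a hypothetical helper, and the sample values are a few rows from the snapshot plus one deliberately inserted NaN.

```python
import numpy as np

def summarize_column(values):
    """Return basic diagnostics for one numeric column."""
    arr = np.asarray(values, dtype=np.float32).reshape(-1)
    finite = np.isfinite(arr)
    return {
        "rows": int(arr.size),
        # Count of NaN and +/-inf entries
        "non_finite": int((~finite).sum()),
        # Range of the finite entries only
        "min": float(arr[finite].min()) if finite.any() else None,
        "max": float(arr[finite].max()) if finite.any() else None,
    }

# Hypothetical sample: four snapshot values plus one injected NaN
x_sample = [80.7, 80.4, 72.5, 103.0, float("nan")]
report = summarize_column(x_sample)
print(report)
```

Running the same helper on `x_train` and `y_train` would show whether the full 1000-row file contains any non-finite values that the 11-row snapshot does not reveal.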

To compute the training accuracy, I created a LinearRegressionModel class.

class LinearRegressionModel:
    def __init__(self):
        # Model input
        self.x = tf.compat.v1.placeholder(tf.float32)
        self.y = tf.compat.v1.placeholder(tf.float32)

        # Model variables
        self.W = tf.Variable([[0.0]])
        self.b = tf.Variable([[0.0]])

        # Predictor
        f = tf.matmul(self.x, self.W) + self.b

        # Mean Squared Error
        self.loss = tf.reduce_mean(tf.square(f - self.y))


model = LinearRegressionModel()

# Training: adjust the model so that its loss is minimized
minimize_operation = tf.compat.v1.train.GradientDescentOptimizer(0.00001).minimize(model.loss)

# Create session object for running TensorFlow operations
session = tf.compat.v1.Session()

# Initialize tf.Variable objects
session.run(tf.compat.v1.global_variables_initializer())

for epoch in range(1000):
    session.run(minimize_operation, {model.x: x_train, model.y: y_train})
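The loop above only runs the update step, so it gives no indication of when the loss stops being finite. As an illustration, the sketch below reimplements the same batch gradient descent update in plain NumPy and records the loss at every epoch. It is an assumption-laden stand-in, not the TensorFlow code: `fit_linear_gd` is a hypothetical name, and the data and the two learning rates are chosen only to show one stable run and one diverging run.

```python
import numpy as np

def fit_linear_gd(x, y, learning_rate, epochs):
    """Fit y = w * x + b by batch gradient descent on the mean squared error."""
    x = np.asarray(x, dtype=np.float64).reshape(-1)
    y = np.asarray(y, dtype=np.float64).reshape(-1)
    w, b = 0.0, 0.0
    history = []
    # Silence overflow warnings; divergence is detected explicitly below
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(epochs):
            error = w * x + b - y
            loss = float(np.mean(error ** 2))
            history.append(loss)
            if not np.isfinite(loss):
                break  # stop as soon as the loss becomes inf or NaN
            # Gradients of the mean squared error with respect to w and b
            w -= learning_rate * 2.0 * np.mean(error * x)
            b -= learning_rate * 2.0 * np.mean(error)
    return w, b, history

# Small illustrative data set
x_demo = [1, 1.5, 2, 3, 4, 5, 6]
y_demo = [5, 3.5, 3, 4, 3, 1.5, 2]

_, _, stable = fit_linear_gd(x_demo, y_demo, learning_rate=0.01, epochs=1000)
_, _, unstable = fit_linear_gd(x_demo, y_demo, learning_rate=1.0, epochs=1000)
print(len(stable), stable[0], stable[-1])
print(len(unstable), unstable[-1])
```

In the TensorFlow version, the equivalent check would be to fetch `model.loss` together with `minimize_operation` inside the loop and inspect it every few epochs.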

I then run the session to fetch the results, but every value comes back as NaN:

W, b, loss = session.run([model.W, model.b, model.loss], {model.x: x_train, model.y: y_train})

This gives the following output:

W = [[nan]], b = [[nan]], loss = nan
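For a line fit, a closed-form least-squares solution gives reference values that a working training run should approach. The sketch below is only a sanity check under stated assumptions: it uses the 11 snapshot rows rather than the full file, and the names `w_ref`, `b_ref`, and `mse_ref` are introduced here for illustration.

```python
import numpy as np

# The 11 rows shown in the CSV snapshot (x, y)
snapshot = np.array([
    [80.7, 11.26768031895252],
    [80.4, 11.958445192769354],
    [72.5, 8.317461617744009],
    [103.0, 18.8084430937359],
    [107.6, 19.4741929365933],
    [79.4, 9.877652348817934],
    [81.9, 11.27064360995227],
    [101.6, 16.404264174870804],
    [108.6, 17.491930911011764],
    [95.7, 15.749425148095193],
    [52.7, 3.5812853213289015],
])
x = snapshot[:, 0]
y = snapshot[:, 1]

# Design matrix with a column of ones so the intercept b is fitted too
design = np.column_stack([x, np.ones_like(x)])
(w_ref, b_ref), *_ = np.linalg.lstsq(design, y, rcond=None)

# Mean squared error of the reference fit, comparable to model.loss
mse_ref = float(np.mean((w_ref * x + b_ref - y) ** 2))
print(w_ref, b_ref, mse_ref)
```

If this computation on the full data set also returns NaN, the problem lies in the data rather than in the optimizer.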

I have tried the same code with the following data, and it works fine:

x_train = np.mat([[1], [1.5], [2], [3], [4], [5], [6]])
y_train = np.mat([[5], [3.5], [3], [4], [3], [1.5], [2]])

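I suspect this has something to do with the format of the data I am using, but at the moment I have no idea what to do about it. Any help would be much appreciated.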
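One visible difference between the two data sets is scale: the CSV values of x are roughly 50 to 110, while the working example uses values from 1 to 6. Whether scale is responsible for the NaN here is not established by the question, but standardizing the input is a common step to try with gradient descent. The sketch below assumes two hypothetical helpers, `standardize` and `to_original_scale`, and uses a few snapshot values for illustration.

```python
import numpy as np

def standardize(values):
    """Scale a column to zero mean and unit standard deviation."""
    arr = np.asarray(values, dtype=np.float64).reshape(-1, 1)
    mean = arr.mean()
    std = arr.std()
    if std == 0.0:
        raise ValueError("cannot standardize a constant column")
    return (arr - mean) / std, mean, std

def to_original_scale(w_scaled, b_scaled, x_mean, x_std):
    """Convert a fit on standardized x back to y = w * x + b on raw x."""
    w = w_scaled / x_std
    b = b_scaled - w * x_mean
    return w, b

# A few x values from the CSV snapshot
x_raw = [80.7, 80.4, 72.5, 103.0, 107.6]
x_scaled, x_mean, x_std = standardize(x_raw)
print(x_scaled.ravel(), x_mean, x_std)
```

The scaled column can be fed to the model in place of `x_train`, and the trained `W` and `b` can then be mapped back to the original units with `to_original_scale`.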

0 answers:

No answers