Reproducing results using Keras with a TensorFlow backend

Time: 2018-10-06 10:22:29

Tags: python tensorflow machine-learning keras regression

I am building a simple neural network using Keras with a TensorFlow backend. I am new to ML.

My goal is to get reproducible results. I have read many blog posts on how to do this and implemented the following:

import numpy as np
import tensorflow as tf
import pandas as pd

# Seed NumPy and TensorFlow (TF 1.x API) before any model code runs
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)

# Parameters
learning_rate = .001
training_epochs = 2000
display_step = 5

train = pd.read_csv('T:/2.0 Pricing/45. Machine learning/R Training/Training-2017/Prince/Kaggle House Price Competition/House Prices - Advanced Regression Techniques/Data/train.csv')
test = pd.read_csv('T:/2.0 Pricing/45. Machine learning/R Training/Training-2017/Prince/Kaggle House Price Competition/House Prices - Advanced Regression Techniques/Data/test.csv')

# Drop outlier rows
train = train.loc[train['GrLivArea'] < 4000]
train = train.loc[train['TotalBsmtSF'] < 5000]
train = train.loc[train['1stFlrSF'] < 3000]
train = train.loc[train['2ndFlrSF'] < 2000]
train = train.loc[train['LotArea'] < 80000]
train = train.loc[train['BsmtFinSF1'] < 5000]

train_numeric=train.select_dtypes(exclude=['object'])
train_catagorical=train.select_dtypes(include=['object'])
train_catagorical=train_catagorical.apply(lambda col: pd.factorize(col, sort=True)[0])
train_variables=train_catagorical.join(train_numeric, how='outer')
#train=pd.get_dummies(train)
#train_variables=train.select_dtypes(include=variables)
#train_variables=train[variables]
train_variables=train_variables.fillna(train_variables.mean())
train_variables = train_variables.drop(labels=['SalePrice','Id'], axis=1)
train_variables=pd.DataFrame(train_variables,dtype='float32')
#train_variables=pd.DataFrame(train_variables[variablestouse],dtype='float32')
train_response=pd.DataFrame(train[['SalePrice']],dtype='float32')

test_numeric=test.select_dtypes(exclude=['object'])
test_catagorical=test.select_dtypes(include=['object'])
test_catagorical=test_catagorical.apply(lambda col: pd.factorize(col, sort=True)[0])
test_variables=test_catagorical.join(test_numeric, how='outer')
#test=pd.get_dummies(test)
#variables=list(test)
#test_variables=test.select_dtypes(exclude=[variables])
test_variables=test_variables.fillna(test_variables.mean())
test_variables = test_variables.drop(labels=['Id'], axis=1)
#test_variables=pd.DataFrame(test_variables[variablestouse],dtype='float32')
test_variables=pd.DataFrame(test_variables,dtype='float32')

n_samples = train.shape[0]

# Model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(100, input_dim=79, kernel_initializer='normal'))
model.add(tf.keras.layers.Dense(50, kernel_initializer='normal', activation='relu'))
model.add(tf.keras.layers.Dense(25, kernel_initializer='normal', activation='relu'))
model.add(tf.keras.layers.Dense(1, kernel_initializer='normal'))

# Compile model
model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam())

feature_cols = train_variables
labels = train_response

#np.random.seed(42)
model.fit(train_variables, train_response, epochs=50, batch_size=100)

# Predictions
feature_cols_test = test_variables

y = model.predict(np.array(feature_cols_test))

However, I still cannot reproduce the results. Any suggestions for making them reproducible?
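For reference, the seeding step being attempted here can be collected into one helper, sketched below. This is a minimal sketch under assumptions, not a guaranteed fix: `set_all_seeds` is a hypothetical name made up for illustration, the TensorFlow branch picks `tf.random.set_seed` (TensorFlow 2.x) or `tf.set_random_seed` (TensorFlow 1.x) depending on which one the installed version exposes, and the helper has to be called before the model is built. Even with every seed fixed, multi-threaded or GPU execution can still introduce small run-to-run differences.

```python
import os
import random

import numpy as np


def set_all_seeds(seed):
    """Seed the random number generators this script may rely on.

    Hypothetical helper for illustration; not part of Keras or TensorFlow.
    """
    # Only affects Python processes started after this point; for the
    # current interpreter it must be set before Python is launched.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)       # Python's built-in RNG
    np.random.seed(seed)    # NumPy's global RNG

    try:
        import tensorflow as tf
    except ImportError:
        return  # TensorFlow is not installed; nothing more to seed

    if hasattr(tf, "random") and hasattr(tf.random, "set_seed"):
        tf.random.set_seed(seed)      # TensorFlow 2.x
    elif hasattr(tf, "set_random_seed"):
        tf.set_random_seed(seed)      # TensorFlow 1.x


# Quick check: reseeding should give identical NumPy draws.
set_all_seeds(1)
first = np.random.rand(3)
set_all_seeds(1)
second = np.random.rand(3)
print(np.array_equal(first, second))
```

The check at the end only confirms that the NumPy side is reproducible. Comparing the output of `model.predict` across two fresh runs of the full script is the real test for the network itself.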

0 answers:

No answers