I'm new to machine learning here, just trying out TensorFlow to understand some basic concepts.
I'd like to ask how exactly TensorFlow computes the output/logits layer, especially for a pre-made model like tf.estimator.DNNClassifier.
When I inspect the logits value from the TensorFlow estimator and try to compute that value manually with the usual y = m*x + b plus a relu activation, the results differ as soon as I increase the node size in hidden_units beyond 2.
Here is my code:
import pandas as pd
import io
import tensorflow as tf
import numpy as np
from sklearn.model_selection import train_test_split
# set Numpy print option
np.set_printoptions(suppress=True, threshold=np.nan)
# prepare data
raw_data = pd.read_csv(io.StringIO(uploaded['dataset.csv'].decode('utf-8')),sep=';')
x_data = raw_data[['index0','index1','index2','index3','index4']]
y_label = raw_data['ClassType']
x_train, x_test, y_train, y_test = train_test_split(x_data, y_label, test_size=0.3, random_state=101)
# Print x_data and y_label head
x_data.head().as_matrix()
""" array([[0.19475628, 0.12746904, 0.04672323, 0.07501275, 0.07693333],
[0.66347056, 0.70985919, 0.70651323, 0.67363084, 0.67045463],
[0.54878968, 0.63825475, 0.63908431, 0.70264373, 0.71397793],
[0.43528058, 0.40889643, 0.40793766, 0.29906271, 0.41115305],
[0.44894346, 0.44471657, 0.42760782, 0.42890334, 0.43528058]])"""
y_label.head().as_matrix()
# array([1, 0, 0, 1, 0])
# Create Feature Columns
IDX0 = tf.feature_column.numeric_column("index0")
IDX1 = tf.feature_column.numeric_column("index1")
IDX2 = tf.feature_column.numeric_column("index2")
IDX3 = tf.feature_column.numeric_column("index3")
IDX4 = tf.feature_column.numeric_column("index4")
feat_cols = [IDX0,IDX1,IDX2,IDX3,IDX4]
# Declare Input Function
input_func = tf.estimator.inputs.pandas_input_fn(x=x_train,y=y_train,batch_size=100,num_epochs=100,shuffle=True)
# FIRST ATTEMPT, DNNClassifier with hidden_units=[2,2]
model = tf.estimator.DNNClassifier(hidden_units = [2,2], feature_columns=feat_cols,model_dir='MODEL',n_classes=2)
model.train(input_fn=input_func,steps=10000)
# try to predict one data
x1 = x_data.iloc[[15]]
test = x1.as_matrix()
# array([[0.44360469, 0.44796062, 0.50455245, 0.50898942, 0.41743509]])
y1 = y_label.iloc[[15]]
# array([0])
predOneFN = tf.estimator.inputs.pandas_input_fn(x=x1,batch_size=1,num_epochs=1,shuffle=False)
predOne_gen = model.predict(predOneFN)
list(predOne_gen)
"""[{'class_ids': array([0]),
'classes': array([b'0'], dtype=object),
'logistic': array([0.4911849], dtype=float32),
'logits': array([-0.03526414], dtype=float32),
'probabilities': array([0.5088151, 0.4911849], dtype=float32)}]"""
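As a side note on reading this dict: for a 2-class DNNClassifier the other fields follow directly from the single logit — `logistic` is sigmoid(logit), and `probabilities` is `[1 - logistic, logistic]`. A quick check in plain NumPy (this just reproduces the relationship between the printed fields, not the estimator internals):

```python
import numpy as np

logit = -0.03526414  # the 'logits' value from the prediction above

# 'logistic' = sigmoid(logit)
logistic = 1.0 / (1.0 + np.exp(-logit))

# 'probabilities' = [P(class 0), P(class 1)] = [1 - logistic, logistic]
probabilities = np.array([1.0 - logistic, logistic])

print(logistic)       # ~0.4911849
print(probabilities)  # ~[0.5088151, 0.4911849]
```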
After that, I tried to compute the logit/output manually, chaining the layers with y = (weights) * (input) + (bias):
# get and store that variable
a = model.get_variable_value('dnn/hiddenlayer_0/bias')
b = model.get_variable_value('dnn/hiddenlayer_0/kernel')
c = model.get_variable_value('dnn/hiddenlayer_1/bias')
d = model.get_variable_value('dnn/hiddenlayer_1/kernel')
k = model.get_variable_value('dnn/logits/bias')
l = model.get_variable_value('dnn/logits/kernel')
# y = mx + b, dense all neuron
mx = np.matmul(test,b)
mx = mx + a
mx = np.maximum(0,mx)
layer2 = np.matmul(mx,d)
layer2 = layer2 + c
# relu
layer2 = np.maximum(0,layer2)
out = np.matmul(layer2,l)
out = out + k
# out = array([[-0.03526414]])
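The per-layer steps above can be folded into a small helper that handles any number of hidden layers (a sketch; it assumes the same pattern as my manual code — relu on every hidden layer, no activation on the logits layer — and takes the kernel/bias pairs fetched above):

```python
import numpy as np

def dnn_forward(x, layers):
    """Forward pass through a DNN.

    `layers` is a list of (kernel, bias) pairs ordered from
    hiddenlayer_0 to the final logits layer. Hidden layers get
    relu; the last (logits) layer is left linear.
    """
    out = x
    for i, (kernel, bias) in enumerate(layers):
        out = out @ kernel + bias
        if i < len(layers) - 1:  # relu on hidden layers only
            out = np.maximum(0, out)
    return out

# usage with the variables fetched above:
# logits = dnn_forward(test, [(b, a), (d, c), (l, k)])
```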
This method (y = m*x + b) produces exactly the same logits output as predOne_gen with hidden_units=[2,2]:
(-0.03526414 vs -0.03526414)
However, when I change the hidden layers to [3,3], retrain, and predict the same single value, I get a slightly different result, like this:
list(predOne_gen)
"""[{'class_ids': array([0]),
'classes': array([b'0'], dtype=object),
'logistic': array([0.45599243], dtype=float32),
'logits': array([-0.17648707], dtype=float32),
'probabilities': array([0.5440076, 0.4559924], dtype=float32)}]"""
mx = np.matmul(test,b)
mx = mx + a
mx = np.maximum(0,mx)
layer2 = np.matmul(mx,d)
layer2 = layer2 + c
# relu
layer2 = np.maximum(0,layer2)
out = np.matmul(layer2,l)
out = out + k
# out = array([[-0.1764868]])
As you can see, the results differ (-0.17648707 vs -0.1764868). The difference here is tiny, but it grows larger and larger as I increase the data dimensions.
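One plausible cause of this mismatch (my assumption, not something I have verified against the estimator internals): TensorFlow runs this forward pass in float32, while the NumPy arrays in the manual computation are float64, so the two accumulate rounding differently. A minimal demonstration of that effect with made-up weights (the shapes mirror the [3,3] network; the values are random, not the trained ones):

```python
import numpy as np

rng = np.random.RandomState(42)
x = rng.randn(1, 5)
w1, b1 = rng.randn(5, 3), rng.randn(3)
w2, b2 = rng.randn(3, 3), rng.randn(3)
w3, b3 = rng.randn(3, 1), rng.randn(1)

def forward(x, w1, b1, w2, b2, w3, b3):
    h1 = np.maximum(0, x @ w1 + b1)
    h2 = np.maximum(0, h1 @ w2 + b2)
    return h2 @ w3 + b3

out64 = forward(x, w1, b1, w2, b2, w3, b3)          # float64 throughout
args32 = [a.astype(np.float32) for a in (x, w1, b1, w2, b2, w3, b3)]
out32 = forward(*args32)                            # float32 throughout

# The two results typically agree only to about float32 precision,
# the same order of magnitude as the mismatch observed above.
print(out64, out32)
```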
So, how can I find out the exact formula TensorFlow uses to compute the logits/output?
Thanks for your help. Note: sorry for my bad English and rough ML code.