I have very imbalanced data between two classes, and I am trying to use TensorFlow to solve an NN problem. I found a post that describes exactly the difficulty I am running into, and it gives a solution that appears to address my problem. However, I am working with an assistant, and neither of us knows Python, so TensorFlow is being used like a black box for us. I have extensive (decades of) experience programming in a wide variety of languages across various paradigms. That experience gives me a pretty good intuitive grasp of the code my assistant cobbled together to get a working model, but neither of us can follow what is actually going on closely enough to tell TensorFlow exactly where we need to make our edits to get what we want.
I am hoping someone familiar with Python and TensorFlow can look at this and tell us something like, "Hey, just edit the file named xxx at lines yyy," so we can get on with it.
Below, I have a link to the solution we want to implement, and I have also included the code my assistant wrote that originally got things up and running. Our code produces good results when our data is balanced, but when it is highly imbalanced, it tends to classify everything into the larger class to get better results.
Here is the link to the solution we found:
Loss function for class imbalanced binary classifier in Tensor flow
I have included the relevant code from that link below. Since I know that where we make these edits will depend on how we are using TensorFlow, I have also included our own implementation immediately after it in the same code block, with comments to make clear what we want to add and what we are currently doing:
# Here is the stuff we need to add some place in the TensorFlow source code:
ratio = 31.0 / (500.0 + 31.0)
class_weight = tf.constant([[ratio, 1.0 - ratio]])
logits = ...  # shape [batch_size, 2]
# this is the weight for each datapoint, depending on its label
weight_per_label = tf.transpose(tf.matmul(labels, tf.transpose(class_weight)))  # shape [1, batch_size]
# note: tf.mul was renamed tf.multiply in TensorFlow 1.0, and the cross-entropy op takes keyword arguments
xent = tf.multiply(weight_per_label,
                   tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels, name="xent_raw"))  # shape [1, batch_size]
loss = tf.reduce_mean(xent)  # scalar
# NOW HERE IS OUR OWN CODE TO SHOW HOW WE ARE USING TensorFlow:
# (Obviously this is not in the same file in real life ...)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
import numpy as np
from math import exp
from PreProcessData import (load_and_process_training_Data,
                            load_and_process_test_data)
from PrintUtilities import printf, printResultCompare

tf.set_random_seed(0)
#==============================================================
# predefine file paths
''' Unbalanced training data: the ratio of target to non-target is 1:11 '''
targetFilePath = '/Volumes/Extend/BCI_TestData/60FeaturesVersion/Train1-35/tar.txt'
nontargetFilePath = '/Volumes/Extend/BCI_TestData/60FeaturesVersion/Train1-35/nontar.txt'
testFilePath = '/Volumes/Extend/BCI_TestData/60FeaturesVersion/Test41/feats41.txt'
labelFilePath = '/Volumes/Extend/BCI_TestData/60FeaturesVersion/Test41/labs41.txt'
# train_x, train_y = load_and_process_training_Data(targetFilePath, nontargetFilePath)
train_x, train_y = load_and_process_training_Data(targetFilePath, nontargetFilePath)
# test_x,test_y = load_and_process_test_data(testFilePath,labelFilePath)
test_x, test_y = load_and_process_test_data(testFilePath,labelFilePath)
# trained neural network path
save_path = "nn_saved_model/model.ckpt"
# number of classes
n_classes = 2 # in this case, target or non_target
# number of hidden layers
num_hidden_layers = 1
# number of nodes in each hidden layer
nodes_in_layer1 = 40
nodes_in_layer2 = 100
nodes_in_layer3 = 30 # We think: 3 layers is dangerous!! try to avoid it!!!!
# number of data examples in each block
block_size = 3000 # the computer may not have enough memory, so we divide the training data into blocks
# number of times we iterate through the training data
total_iterations = 1000
# terminate training if the computed loss < the expected loss
expected_loss = 0.1
# max learning rate and min learning rate
max_learning_rate = 0.002
min_learning_rate = 0.0002
# These are placeholders for some values in graph
# tf.placeholder(dtype, shape=None(optional), name=None(optional))
# A placeholder tensor to hold our data features
x = tf.placeholder(tf.float32, [None, len(train_x[0])])
# Every row is one-hot: [1,0] for target or [0,1] for non_target. Placeholder to hold the one-hot labels.
# (float32 rather than int8: softmax_cross_entropy_with_logits expects labels with the same dtype as the logits)
Y_C = tf.placeholder(tf.float32, [None, n_classes])
# variable learning rate
lr = tf.placeholder(tf.float32)
# neural network model
def neural_network_model(data):
    if num_hidden_layers == 1:
        # layers contain weights and a bias so that, even if all neurons feed a 0 into the layer, we still get a result out
        # When using ReLUs, make sure biases are initialised with small *positive* values, for example 0.1 = tf.ones([K])/10
        hidden_1_layer = {'weights': tf.Variable(tf.random_normal([len(train_x[0]), nodes_in_layer1])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer1]) / 10)}
        # the output layer gets a plain zero bias (no ReLU there, so the positive-bias trick is not needed)
        output_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer1, n_classes])),
                        'bias': tf.Variable(tf.zeros([n_classes]))}
        # multiply the raw input data by its unique weights (starting as random, but they will be optimized)
        l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['bias'])
        l1 = tf.nn.relu(l1)
        # We repeat this process for each of the hidden layers, all the way down to our output, where the final values are still the multiplication of the input and the weights, plus the output layer's bias values.
        Ylogits = tf.matmul(l1, output_layer['weights']) + output_layer['bias']
    if num_hidden_layers == 2:
        # same structure as the 1-layer case above, with one more hidden layer
        hidden_1_layer = {'weights': tf.Variable(tf.random_normal([len(train_x[0]), nodes_in_layer1])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer1]) / 10)}
        hidden_2_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer1, nodes_in_layer2])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer2]) / 10)}
        output_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer2, n_classes])),
                        'bias': tf.Variable(tf.zeros([n_classes]))}
        l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['bias'])
        l1 = tf.nn.relu(l1)
        l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['bias'])
        l2 = tf.nn.relu(l2)
        Ylogits = tf.matmul(l2, output_layer['weights']) + output_layer['bias']
    if num_hidden_layers == 3:
        # same structure again, with a third hidden layer
        hidden_1_layer = {'weights': tf.Variable(tf.random_normal([len(train_x[0]), nodes_in_layer1])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer1]) / 10)}
        hidden_2_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer1, nodes_in_layer2])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer2]) / 10)}
        hidden_3_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer2, nodes_in_layer3])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer3]) / 10)}
        output_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer3, n_classes])),
                        'bias': tf.Variable(tf.zeros([n_classes]))}
        l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['bias'])
        l1 = tf.nn.relu(l1)
        l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['bias'])
        l2 = tf.nn.relu(l2)
        l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['bias'])
        l3 = tf.nn.relu(l3)
        Ylogits = tf.matmul(l3, output_layer['weights']) + output_layer['bias']
    return Ylogits  # return the logits of the neural network model
# set up the training process
def train_neural_network(x):
    # produce the prediction based on the output of the nn model
    Ylogits = neural_network_model(x)
    # measure the error with the built-in cross-entropy function; this is the value we want to minimize
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Y_C))
    # To optimize our cost (cross_entropy) and reduce the error. The default learning_rate is 0.001, but here we feed in our own decaying rate through the lr placeholder.
    # optimizer = tf.train.GradientDescentOptimizer(0.003)
    optimizer = tf.train.AdamOptimizer(lr)
    train_step = optimizer.minimize(cross_entropy)
    # start the session
    with tf.Session() as sess:
        # We initialize all of our variables first, before starting
        sess.run(tf.global_variables_initializer())
        # iterate total_iterations times (cycles of feed-forward and backprop); each epoch means the network sees all of train_data once
        for epoch in range(total_iterations):
            # track the total cost per epoch; a declining value means a better result
            epoch_loss = 0
            i = 0
            decay_speed = 150
            # current learning rate
            learning_rate = min_learning_rate + (max_learning_rate - min_learning_rate) * exp(-epoch / decay_speed)
            # divide the dataset into len(train_x)/block_size batches in case we run out of memory
            while i < len(train_x):
                # load train data
                start = i
                end = i + block_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])
                train_data = {x: batch_x, Y_C: batch_y, lr: learning_rate}
                # train
                # sess.run(train_step, feed_dict=train_data)
                # run optimizer and cost against this batch of data.
                _, c = sess.run([train_step, cross_entropy], feed_dict=train_data)
                epoch_loss += c
                i += block_size
            # print iteration status
            printf("epoch: %5d/%d , loss: %f", epoch, total_iterations, epoch_loss)
            # terminate training when loss < expected_loss
            if epoch_loss < expected_loss:
                break
        # how many predictions we made that were perfect matches to their labels
        # test model
        # test data
        test_data = {x: test_x, Y_C: test_y}
        # calculate accuracy
        correct_prediction = tf.equal(tf.argmax(Ylogits, 1), tf.argmax(Y_C, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float'))
        print('Accuracy:', accuracy.eval(test_data))
        # result matrix: argmax returns the position of the 1 in each row
        result = sess.run(tf.argmax(Ylogits.eval(feed_dict=test_data), 1))
        answer = []
        for i in range(len(test_y)):
            if test_y[i] == [0, 1]:
                answer.append(1)
            elif test_y[i] == [1, 0]:
                answer.append(0)
        answer = np.array(answer)
        printResultCompare(result, answer)
        # save the raw predictions (logits) for the test set
        np.savetxt('nn_prediction.txt', Ylogits.eval(feed_dict={x: test_x}), delimiter=',', newline="\r\n")
        # save the nn model for later use
        # 'Saver' op to save and restore all the variables
        saver = tf.train.Saver()
        saver.save(sess, save_path)
        # print("Model saved in file: %s" % save_path)
# load the trained neural network model
def test_loaded_neural_network(trained_NN_path):
    Ylogits = neural_network_model(x)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # load the saved model
        saver.restore(sess, trained_NN_path)
        print("Loading variables from '%s'." % trained_NN_path)
        np.savetxt('nn_prediction.txt', Ylogits.eval(feed_dict={x: test_x}), delimiter=',', newline="\r\n")
        # test model
        # result matrix
        result = sess.run(tf.argmax(Ylogits.eval(feed_dict={x: test_x}), 1))
        # answer matrix
        answer = []
        for i in range(len(test_y)):
            if test_y[i] == [0, 1]:
                answer.append(1)
            elif test_y[i] == [1, 0]:
                answer.append(0)
        answer = np.array(answer)
        printResultCompare(result, answer)
        # calculate accuracy
        correct_prediction = tf.equal(tf.argmax(Ylogits, 1), tf.argmax(Y_C, 1))
        print(Ylogits.eval(feed_dict={x: test_x}).shape)

train_neural_network(x)
# test_loaded_neural_network(save_path)
So, can anyone help point us to the right places to make the edits that will solve our problem? (i.e., what is the name of the file we need to edit, and where is it located?) Thanks in advance!
-gt
Answer 0 (score: 0)
You should add this code inside your train_neural_network(x) function:
# ratio = (number of examples in class 1) / ((number in class 0) + (number in class 1)),
# so each class ends up weighted by the *other* class's frequency and the minority class gets the larger weight
ratio = 31.0 / (500.0 + 31.0)  # e.g. 31 examples of class 1 and 500 of class 0, as in the linked answer; plug in your own counts
class_weight = tf.constant([[ratio, 1.0 - ratio]])
Ylogits = neural_network_model(x)
# Y_C must be float for this matmul (hence the float32 placeholder above)
weight_per_label = tf.transpose(tf.matmul(Y_C, tf.transpose(class_weight)))  # shape [1, batch_size]
# note: tf.mul was renamed tf.multiply in TensorFlow 1.0
cross_entropy = tf.reduce_mean(tf.multiply(weight_per_label,
                                           tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Y_C)))
optimizer = tf.train.AdamOptimizer(lr)
train_step = optimizer.minimize(cross_entropy)
instead of these lines:
Ylogits = neural_network_model(x)
# measure the error with the built-in cross-entropy function; this is the value we want to minimize
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Y_C))
# To optimize our cost (cross_entropy) and reduce the error. The default learning_rate is 0.001, but here we feed in our own decaying rate through the lr placeholder.
# optimizer = tf.train.GradientDescentOptimizer(0.003)
optimizer = tf.train.AdamOptimizer(lr)
train_step = optimizer.minimize(cross_entropy)
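If you don't want to hard-code the class counts, the same ratio can be derived from the one-hot training labels at startup. This is only a minimal sketch, assuming (from your code above) that train_y holds rows of [1,0] for target and [0,1] for non-target; counts is an illustrative name, not anything from your files:

import numpy as np        # already imported in your file
import tensorflow as tf   # already imported in your file

# per-class totals over the training set: [n_class0, n_class1]
counts = np.array(train_y, dtype=np.float32).sum(axis=0)
# fraction of class-1 examples, matching the formula in the comment above
ratio = counts[1] / counts.sum()
class_weight = tf.constant([[ratio, 1.0 - ratio]])  # each class weighted by the other's frequency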
This is because, in a neural network, we compute the prediction error with respect to the targets (the true labels). In your case you use the cross-entropy error, which for each example sums, over the classes, the target value times the log of the predicted probability: H(y, p) = -Σ_i y_i log(p_i).
The network's optimizer back-propagates to minimize this error and achieve higher accuracy.
Without a weighted loss, the weight of every class is equal, so the optimizer reduces the error for the class with more examples and neglects the other one.
So, to prevent this phenomenon, we should force the optimizer to back-propagate a larger error for the minority class, and to do that we multiply each example's error by a class-dependent scalar.
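To see concretely what that multiplication does, here is a tiny NumPy-only sketch (a made-up batch of three examples; the 1:11 target/non-target balance is taken from the question's comment) of the same weight_per_label computation:

import numpy as np

# made-up batch: two majority (non-target) rows and one minority (target) row
labels = np.array([[0., 1.],
                   [0., 1.],
                   [1., 0.]], dtype=np.float32)
ratio = 11.0 / 12.0  # fraction of class-1 (non-target) examples for a 1:11 balance
class_weight = np.array([[ratio, 1.0 - ratio]], dtype=np.float32)
# the same computation as tf.matmul(Y_C, tf.transpose(class_weight)) above
weight_per_label = labels.dot(class_weight.T).T
print(weight_per_label)  # [[0.0833 0.0833 0.9167]]
# each minority example's error is scaled ~11x more than each majority example's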
I hope it is useful :)