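I am trying to build a neural network for baseball that detects the position of the ball and of the strike zone as the pitch crosses the plate, but my network seems to be stuck in a local minimum: it returns the same values for every item in the dataset.
My approach is to take 34 frames as the ball approaches the plate and use them to detect when the ball crosses the plate.
The outputs are ball_left, ball_top, ball_width, strike_zone_left, strike_zone_top, strike_zone_width, strike_zone_height, and frame_where_ball_crossed_plate.
My model runs convolutions on each frame, but instead of ending in a fully connected layer it uses an LSTM, so that the network can infer things from earlier frames. I need to infer from earlier frames because the ball is sometimes not visible, either because the pitcher is in front of it or because it is in the catcher's glove.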
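To make the per-frame convolution followed by an LSTM concrete, here is a minimal pure-Python sketch of the regrouping involved: the batch and time axes are merged so every frame goes through the same convolution stack, then split back into one sequence per pitch. The helper names `flatten_frames` and `regroup_frames` are hypothetical, and each `(pitch, frame)` tuple stands in for a per-frame feature vector.

```python
def flatten_frames(batch):
    """Merge the batch and time axes: [pitches][frames] -> [pitches * frames]."""
    return [frame for pitch in batch for frame in pitch]


def regroup_frames(flat, sequence_length):
    """Inverse of flatten_frames: split the flat list back into one list per pitch."""
    return [flat[i:i + sequence_length] for i in range(0, len(flat), sequence_length)]


sequence_length = 34  # frames per pitch, as described in the question
num_pitches = 3       # illustrative batch size (an assumption)

# Each entry is a (pitch index, frame index) tag standing in for a feature vector.
batch = [[(p, t) for t in range(sequence_length)] for p in range(num_pitches)]

flat = flatten_frames(batch)                      # what the conv layers would see
regrouped = regroup_frames(flat, sequence_length)  # what the LSTM would see

print(len(flat), len(regrouped), regrouped == batch)
```

The round trip only works because the flattening is row-major (all frames of pitch 0, then all frames of pitch 1, and so on), which is the same ordering a row-major reshape uses.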
Watching the cost over time, the network seems to settle into a local minimum and produces the same result for every pitch in the training set.
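One way to confirm this symptom numerically, sketched here under assumptions, is to check two things: whether each output column has (near) zero spread across samples, and how the plateaued cost compares with the cost of always predicting the per-column mean of the targets. That constant is the best possible constant prediction under mean squared error. The helper names and the sample numbers below are hypothetical and are not taken from the actual dataset.

```python
from statistics import mean, pstdev


def per_output_spread(predictions):
    """Population standard deviation of each output column across samples."""
    columns = list(zip(*predictions))
    return [pstdev(col) for col in columns]


def mean_baseline_cost(targets):
    """MSE over all elements when every sample is predicted as the column means."""
    columns = list(zip(*targets))
    col_means = [mean(col) for col in columns]
    squared_errors = [
        (value - col_means[j]) ** 2
        for row in targets
        for j, value in enumerate(row)
    ]
    return mean(squared_errors)


# Made-up numbers with two outputs per sample, for brevity.
targets = [[10.0, 1.0], [20.0, 3.0], [30.0, 5.0]]
predictions = [[20.0, 3.0], [20.0, 3.0], [20.0, 3.0]]

spread = per_output_spread(predictions)
baseline = mean_baseline_cost(targets)
print("spread per output:", spread)
print("mean-baseline cost:", baseline)
```

If the spread is zero for every output and the training cost sits close to the baseline value, the network has most likely collapsed to predicting a constant rather than learning from the frames.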
Here is my code.
filter_size1 = 5 # Convolution filters are 5 x 5 pixels.
num_filters1 = 16 # There are 16 of these filters.
filter_size2 = 5 # Convolution filters are 5 x 5 pixels.
num_filters2 = 36 # There are 36 of these filters.
filter_size3 = 5 # Convolution filters are 5 x 5 pixels.
num_filters3 = 36 # There are 36 of these filters.
num_hidden = 256
lstm_layers = 2
num_channels = 1
num_classes = 10
width = 320
height = 180
sequence_length = Directories.Pitch_Sequence_Length
x = tf.placeholder(tf.float32, shape=[None, sequence_length, width, height], name='x')
keep_prob = tf.placeholder(tf.float32, name="keep_prob")
x_image = tf.reshape(x, [-1, width, height, num_channels])
y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')
layer_conv1, weights_conv1, biases_conv1 = ConvolutionNeuralNetwork.new_conv_layer(
    input=x_image,
    num_input_channels=num_channels,
    filter_size=filter_size1,
    num_filters=num_filters1,
    use_pooling=True)
layer_conv2, weights_conv2, biases_conv2 = ConvolutionNeuralNetwork.new_conv_layer(
    input=layer_conv1,
    num_input_channels=num_filters1,
    filter_size=filter_size2,
    num_filters=num_filters2,
    use_pooling=True)
layer_conv3, weights_conv3, biases_conv3 = ConvolutionNeuralNetwork.new_conv_layer(
    input=layer_conv2,
    num_input_channels=num_filters2,
    filter_size=filter_size3,
    num_filters=num_filters3,
    use_pooling=True)
layer_flat, num_features = ConvolutionNeuralNetwork.flatten_layer(layer_conv3)
fc_sequence = tf.reshape(layer_flat, [-1, sequence_length, int(layer_flat.shape[1])])
cell = tf.contrib.rnn.BasicLSTMCell(num_hidden)
cell = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=keep_prob)
cell = tf.contrib.rnn.MultiRNNCell([cell] * lstm_layers)
outputs, states = tf.contrib.rnn.static_rnn(cell, tf.unstack(tf.transpose(fc_sequence, perm=[1, 0, 2])), dtype=tf.float32)
self.y_pred = ConvolutionNeuralNetwork.new_fc_layer(outputs[-1], num_hidden, num_classes, use_relu=True)
self.cost = tf.reduce_mean(tf.pow(self.y_pred - y_true, 2))
self.optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(self.cost)
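The shape bookkeeping in the graph above can be checked by hand. The sketch below assumes three stride-1 'SAME' convolutions, each followed by a 2x2 'SAME' max pool with stride 2, as in the helper module, and a standard LSTM layout with four gates. The function names are hypothetical.

```python
import math


def same_pool_size(size, stride=2):
    """Output length of a 'SAME'-padded pooling op: ceil(size / stride)."""
    return math.ceil(size / stride)


def flattened_features(width, height, num_pool_layers, last_num_filters):
    """Per-frame feature count after repeated stride-1 conv + stride-2 pooling."""
    for _ in range(num_pool_layers):
        width = same_pool_size(width)
        height = same_pool_size(height)
    return width, height, width * height * last_num_filters


w, h, num_features = flattened_features(
    width=320, height=180, num_pool_layers=3, last_num_filters=36)
print(w, h, num_features)  # 40 23 33120

# A standard LSTM layer has 4 gates, each with (input + hidden) weights per
# unit plus one bias per unit.
num_hidden = 256
first_lstm_params = 4 * ((num_features + num_hidden) * num_hidden + num_hidden)
print(first_lstm_params)  # 34178048
```

Under these assumptions each frame reaches the LSTM as a 33,120-dimensional vector, so the first recurrent layer alone holds roughly 34 million parameters.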
And here is the module ConvolutionNeuralNetwork.py:
import tensorflow as tf


def new_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))


def new_biases(length):
    return tf.Variable(tf.constant(0.05, shape=[length]))


def new_conv_layer(input, num_input_channels, filter_size, num_filters, use_pooling=True, weights=None, biases=None):
    shape = [filter_size, filter_size, num_input_channels, num_filters]
    if weights is None:
        weights = new_weights(shape=shape)
    if biases is None:
        biases = new_biases(length=num_filters)
    layer = tf.nn.conv2d(input=input, filter=weights, strides=[1, 1, 1, 1], padding='SAME')
    layer += biases
    if use_pooling:
        layer = tf.nn.max_pool(value=layer, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    layer = tf.nn.relu(layer)
    return layer, weights, biases


def flatten_layer(layer):
    layer_shape = layer.get_shape()
    num_features = layer_shape[1:4].num_elements()
    layer_flat = tf.reshape(layer, [-1, num_features])
    return layer_flat, num_features


def flatten_layer_multiple(layer1, layer2):
    layer = tf.concat([layer1, layer2], 1)
    layer_shape = layer.get_shape()
    num_features = layer_shape[1:4].num_elements()
    layer_flat = tf.reshape(layer, [-1, num_features])
    return layer_flat, num_features


def new_fc_layer(input, num_inputs, num_outputs, use_relu=True, keep_prob=None):
    weights = new_weights(shape=[num_inputs, num_outputs])
    biases = new_biases(length=num_outputs)
    layer = tf.matmul(input, weights) + biases
    if use_relu:
        layer = tf.nn.relu(layer)
    if keep_prob is not None:
        tf.nn.dropout(layer, keep_prob)
    return layer
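Here is my learning curve (cost over time):
Here is a sample of what the result is supposed to look like:
And this is what the neural network predicts:
The network draws the box for the strike zone and the box for the ball in exactly the same position for every pitch. Can anyone see what I am doing wrong?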