From DNA sequences, my one-hot encoding for each base (A, C, G, T, N) is:
{'A': [1, 0, 0, 0, 0],
'C': [0, 1, 0, 0, 0],
'G': [0, 0, 1, 0, 0],
'T': [0, 0, 0, 1, 0],
'N': [0, 0, 0, 0, 1]}
Each DNA sequence is 400 bases long, so my final training data has the shape X_train.shape = (111453, 400, 5)
(111453 rows, 400 letters, each letter encoded as a 5-element vector).
My label data is a simple yes/no: [1] if the DNA sequence has an error, [0] otherwise, so Y_train.shape is (111453, 1).
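For context, a minimal sketch of how that encoding could be produced (encode_sequence and the ONE_HOT name are mine; the mapping itself is just the table above):

import numpy as np

ONE_HOT = {'A': [1, 0, 0, 0, 0],
           'C': [0, 1, 0, 0, 0],
           'G': [0, 0, 1, 0, 0],
           'T': [0, 0, 0, 1, 0],
           'N': [0, 0, 0, 0, 1]}

def encode_sequence(seq):
    """Map a 400-character DNA string to a (400, 5) float32 array."""
    return np.array([ONE_HOT[base] for base in seq], dtype=np.float32)

# Stacking 111453 encoded sequences gives X_train.shape == (111453, 400, 5).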
I am trying to build a small NN with TensorFlow:
layer_1_nodes = 5
layer_2_nodes = 10
layer_3_nodes = 5
learning_rate = 0.001
training_epochs = 5
number_of_outputs = 1
# Input Layer
with tf.variable_scope('input'):
    X = tf.placeholder(tf.float32, shape=(None, 400, 5), name="X")

# Layer 1
with tf.variable_scope('layer_1'):
    weights = tf.get_variable("weights1", shape=[400, 5, layer_1_nodes],
                              initializer=tf.contrib.layers.xavier_initializer())
    biases = tf.get_variable(name="biases1", shape=[layer_1_nodes],
                             initializer=tf.zeros_initializer())
    layer_1_output = tf.nn.relu(tf.matmul(X, weights) + biases)

# Layer 2
with tf.variable_scope('layer_2'):
    weights = tf.get_variable("weights2", shape=[400, layer_1_nodes, layer_2_nodes],
                              initializer=tf.contrib.layers.xavier_initializer())
    biases = tf.get_variable(name="biases2", shape=[layer_2_nodes],
                             initializer=tf.zeros_initializer())
    layer_2_output = tf.nn.relu(tf.matmul(layer_1_output, weights) + biases)

# Layer 3
with tf.variable_scope('layer_3'):
    weights = tf.get_variable("weights3", shape=[400, layer_2_nodes, layer_3_nodes],
                              initializer=tf.contrib.layers.xavier_initializer())
    biases = tf.get_variable(name="biases3", shape=[layer_3_nodes],
                             initializer=tf.zeros_initializer())
    layer_3_output = tf.nn.relu(tf.matmul(layer_2_output, weights) + biases)

with tf.variable_scope('layer_drop'):
    dropout = tf.layers.dropout(inputs=layer_3_output, rate=0.4)

# Output Layer
with tf.variable_scope('output'):
    weights = tf.get_variable("weights4", shape=[400, layer_3_nodes, number_of_outputs],
                              initializer=tf.contrib.layers.xavier_initializer())
    biases = tf.get_variable(name="biases4", shape=[number_of_outputs],
                             initializer=tf.zeros_initializer())
    prediction = tf.matmul(dropout, weights) + biases

with tf.variable_scope('cost'):
    Y = tf.placeholder(tf.float32, shape=(None, 1), name="Y")
    cost = tf.reduce_mean(tf.squared_difference(prediction, Y))
But I keep getting errors about tensor shapes, either at the first matmul in layer 1 or at squared_difference() in the cost (Incompatible shapes: [400,400,1] <- the prediction tensor vs. [111453,1]).
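For what it's worth, with 3-D weight tensors tf.matmul performs a batched matrix multiply, so the leading 400 of each weight tensor collides with the batch dimension. A minimal sketch of one way to make the shapes line up, assuming a plain fully connected net is the intent: flatten each (400, 5) sequence into one 2000-element vector so every weight matrix is 2-D. Only the first hidden layer and the output are sketched, and the sigmoid cross-entropy loss is a swapped-in choice for the yes/no label (squared error would also run, but is unusual for binary classification):

import tensorflow as tf

layer_1_nodes = 5
number_of_outputs = 1

with tf.variable_scope('input'):
    X = tf.placeholder(tf.float32, shape=(None, 400, 5), name="X")
    # Flatten (None, 400, 5) -> (None, 2000): one feature vector per sequence.
    X_flat = tf.reshape(X, [-1, 400 * 5])

with tf.variable_scope('layer_1'):
    weights = tf.get_variable("weights1", shape=[400 * 5, layer_1_nodes],
                              initializer=tf.contrib.layers.xavier_initializer())
    biases = tf.get_variable("biases1", shape=[layer_1_nodes],
                             initializer=tf.zeros_initializer())
    layer_1_output = tf.nn.relu(tf.matmul(X_flat, weights) + biases)  # (None, 5)

with tf.variable_scope('output'):
    weights = tf.get_variable("weights_out", shape=[layer_1_nodes, number_of_outputs],
                              initializer=tf.contrib.layers.xavier_initializer())
    biases = tf.get_variable("biases_out", shape=[number_of_outputs],
                             initializer=tf.zeros_initializer())
    prediction = tf.matmul(layer_1_output, weights) + biases          # (None, 1)

with tf.variable_scope('cost'):
    Y = tf.placeholder(tf.float32, shape=(None, 1), name="Y")
    cost = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=Y, logits=prediction))

The middle layers would follow the same 2-D pattern, e.g. weights2 with shape [layer_1_nodes, layer_2_nodes].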
Trying a Keras model instead:
input_shape = (400, 5, 1)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
but I can't get the input shape right.
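A minimal sketch of one way the Keras route could work, assuming the data stays 3-D: treat the 400 positions as the single spatial axis and the 5 one-hot components as channels, and use Conv1D with input_shape=(400, 5) so X_train can be fed as-is; the filter count, kernel size, pooling layer, and final sigmoid unit are my choices, not from the original:

from keras.models import Sequential
from keras.layers import Conv1D, GlobalMaxPooling1D, Dense

model = Sequential()
# 400 positions along one spatial axis, 5 one-hot channels per position,
# so X_train with shape (111453, 400, 5) needs no reshaping.
model.add(Conv1D(32, kernel_size=3, activation='relu', input_shape=(400, 5)))
model.add(GlobalMaxPooling1D())
model.add(Dense(1, activation='sigmoid'))   # yes/no output

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# model.fit(X_train, Y_train, epochs=5, batch_size=128)

If Conv2D with input_shape=(400, 5, 1) is kept instead, the data would first need a trailing channel axis, e.g. X_train = X_train[..., np.newaxis], giving (111453, 400, 5, 1).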