Question

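I know there are already some good posts on this topic (this one is very good and very detailed), but after two hours of struggling I am still running into problems: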

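Just for some context: I am computing spectrograms of some wav files (16 kHz, 3 seconds split into 20 ms windows) and then trying to feed them into a neural network to find out whether they contain a specific word (the output being a certainty between 0 and 1).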

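import numpy as np
from scipy import signal
from scipy.io import wavfile
from keras.models import Sequential
from keras.layers import Dense, Activation

def obtain_sample(wav):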
    sample_rate, samples = wavfile.read(wav)
    frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate, nperseg=320, noverlap=16)
    dBS = 10 * np.log10(spectrogram)  # convert to dB

    return dBS

def create_model():
    print("Creating Model...")
    model= Sequential()
    model.add(Dense(10,input_shape=(161,157)))
    model.add(Activation('sigmoid'))

    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

    com1=obtain_sample("comando.wav")
    com2=obtain_sample("comando2.wav")
    nocom=obtain_sample("nocomando.wav")
    inputs=np.array([com1,com2,nocom])
    results=np.array([[1.],[1.],[0.]])
    model.fit(inputs,results,epochs=10,)
    #model.fit(com1,[1.],epochs=10)
    #model.fit(com2,[1.],epochs=10)
    #model.fit(nocom,[0.],epochs=10)

    model.save("modelo_comando")
    print("Model saved")

I am actually getting the following error:

ValueError('Error when checking target: expected activation_1 to have 3 dimensions, but got array with shape (3, 1)',)

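After about an hour of inspecting local variable values to try to explain the problem better, I think what I want to ask is whether I am giving the correct input shape, and how to use Flatten / Reshape layers in turn to get a single output value per sample.

Sorry I can't be more specific.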

Answer 1

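Add a Flatten layer after the Dense layer, and then another Dense layer after the Flatten layer whose number of units equals the output shape you expect. A Dense layer applied to a (161, 157) input only acts on the last axis, so its output is (161, 10) per sample, which is why Keras expected a 3-dimensional target. In this case we expect a single value per sample, hence Dense(1).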

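import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Flatten, Activation

# Random stand-in for three spectrograms of shape (161, 157)
inputs = np.random.rand(3,161,157)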
model= Sequential()
model.add(Dense(10,input_shape=(161,157)))
model.add(Flatten())
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
results=np.array([[1.],[1.],[0.]])
model.fit(inputs,results,epochs=10)

I ran the code above without any problems. Please check it.
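To see why the Flatten layer is needed, it can help to trace the array shapes by hand. The following is only a NumPy sketch of the shape arithmetic, not Keras itself: the weight arrays `w1` and `w2` are made-up random values standing in for the trained layer weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three fake spectrograms with the same shape as in the question.
batch = rng.random((3, 161, 157))

# Dense(10) on a 3D input acts on the last axis only: 157 -> 10.
# w1 and b1 are hypothetical stand-ins for the layer's weights.
w1 = rng.standard_normal((157, 10)) * 0.01
b1 = np.zeros(10)
hidden = batch @ w1 + b1                      # shape (3, 161, 10)

# Flatten collapses everything except the batch axis: 161 * 10 = 1610.
flat = hidden.reshape(hidden.shape[0], -1)    # shape (3, 1610)

# Dense(1) followed by a sigmoid gives one value in (0, 1) per sample.
w2 = rng.standard_normal((1610, 1)) * 0.01
b2 = np.zeros(1)
probs = 1.0 / (1.0 + np.exp(-(flat @ w2 + b2)))  # shape (3, 1)

print(hidden.shape, flat.shape, probs.shape)
```

Without the Flatten and Dense(1) steps the output stays at (3, 161, 10), which cannot be matched against targets of shape (3, 1).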

Predicting with the model

# Since I don't have the original data, I am creating some random values
test = np.random.rand(161,157)
test = np.expand_dims(test,axis=0)
model.predict(test)
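On the question of whether (161, 157) is the right input shape: the arithmetic below suggests it is, assuming each file is mono and exactly 3 seconds long at 16 kHz. The helper `spectrogram_shape` is a hypothetical name introduced here; it only reproduces the frame counting that `signal.spectrogram` does with its default settings (no padding, one-sided output, `nfft` equal to `nperseg`).

```python
def spectrogram_shape(n_samples, nperseg, noverlap):
    """Expected (n_freqs, n_frames) for a real-valued mono signal."""
    n_freqs = nperseg // 2 + 1                  # one-sided FFT bins
    hop = nperseg - noverlap                    # samples between window starts
    n_frames = (n_samples - noverlap) // hop    # only complete windows count
    return n_freqs, n_frames

sample_rate = 16000
n_samples = 3 * sample_rate                     # 3 seconds of audio

shape = spectrogram_shape(n_samples, nperseg=320, noverlap=16)
print(shape)
```

A window of 320 samples is 20 ms at 16 kHz, which matches the description in the question. If the recordings differ in length, the number of frames will differ too and the spectrograms can no longer be stacked into a single array, so it is worth checking each file's shape before training.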
