How to batch variable-length spectrograms in TensorFlow

Asked: 2018-12-31 06:20:59

Tags: python-3.x tensorflow tensorflow-datasets

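I have to train a denoising autoencoder, which means pairing every 5 frames of noisy power spectrum with 1 frame of clean power spectrum. Because my recordings all have different lengths along the time axis, I am unable to batch the spectrograms.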

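import tensorflow as tf  # TensorFlow 1.x (this code relies on tf.contrib)

def parse_line(noise_file, clean_file):
    # Noisy input: decode the WAV file and compute its log power spectrum, shape [1, T, 257].
    noise_binary = tf.read_file(noise_file)
    noise_binary = tf.contrib.ffmpeg.decode_audio(noise_binary, file_format='wav', samples_per_second=16000, channel_count=1)
    noise_stfts = tf.contrib.signal.stft(tf.reshape(noise_binary, [1, -1]), frame_length=512, frame_step=256, fft_length=512)
    noise_powerspectrum = tf.log(tf.abs(noise_stfts)**2)
    # Sliding windows of 5 consecutive frames: [1, T - 4, 5, 257] -> [T - 4, 5, 257].
    # Squeeze only the leading axis so that a length-1 time axis is never dropped by accident.
    noise_data = tf.squeeze(tf.contrib.signal.frame(noise_powerspectrum, frame_length=5, frame_step=1, axis=1), axis=[0])
    # Clean target: same processing without windowing.
    clean_binary = tf.read_file(clean_file)
    clean_binary = tf.contrib.ffmpeg.decode_audio(clean_binary, file_format='wav', samples_per_second=16000, channel_count=1)
    clean_stfts = tf.contrib.signal.stft(tf.reshape(clean_binary, [1, -1]), frame_length=512, frame_step=256, fft_length=512)
    clean_powerspectrum = tf.log(tf.abs(clean_stfts)**2)
    # Drop the last 4 frames so the target has one frame per noisy window: [T - 4, 257].
    clean_data = tf.squeeze(clean_powerspectrum, axis=[0])[:-4]
    return noise_data, clean_data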
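
To make the shapes returned by parse_line concrete, here is a small NumPy sketch of the same windowing. The helper name sliding_windows and the toy sizes are assumptions made for illustration only. It mimics framing with frame_length=5, frame_step=1 and no end padding, and it assumes the noisy and clean recordings have the same number of STFT frames.

```python
import numpy as np

def sliding_windows(spec, frame_length=5):
    """Stack overlapping windows: [T, F] -> [T - frame_length + 1, frame_length, F]."""
    num_windows = spec.shape[0] - frame_length + 1
    return np.stack([spec[i:i + frame_length] for i in range(num_windows)])

# Toy sizes (hypothetical): 10 STFT frames, 257 = 512 // 2 + 1 frequency bins.
T, F = 10, 257
noise_spec = np.random.rand(T, F).astype(np.float32)
clean_spec = np.random.rand(T, F).astype(np.float32)

noise_data = sliding_windows(noise_spec)  # one 5-frame window per row
clean_data = clean_spec[:-4]              # one clean frame per window

print(noise_data.shape, clean_data.shape)
```

With this slicing, clean_data[i] is clean frame i, and it is paired with noisy frames i through i+4. The first dimension (T - 4) depends on the length of the recording, which is why two files produce tensors such as [443,5,257] and [280,5,257].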

My tf.data pipeline looks like this:

shuffle_batch = 10
batch_size = 10
dataset = tf.data.Dataset.from_tensor_slices((noise_datalist, clean_datalist))
dataset = dataset.shuffle(shuffle_batch)  # shuffle buffer size, counted in files
dataset = dataset.map(parse_line, num_parallel_calls=8)
dataset = dataset.batch(batch_size)  # requires every element to have the same shape
dataset = dataset.prefetch(tf.contrib.data.AUTOTUNE)
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()

This is the error that is raised:

InvalidArgumentError (see above for traceback): Cannot batch tensors with different shapes in component 0. First element had shape [443,5,257] and element 1 had shape [280,5,257].
 [[{{node IteratorGetNext}} = IteratorGetNext[output_shapes=[<unknown>, <unknown>], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]

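When I change batch_size to 1, it works and yields one element at a time. How can I batch this variable-length data, or perhaps even merge all of the data into a single batch, e.g. [443,5,257] and [280,5,257] into [723,5,257]?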
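
Two possible directions, given here as a hedged sketch rather than a confirmed answer. It assumes TensorFlow 2.x with eager execution so that it can run on its own, and it uses random arrays (the hypothetical fake_examples generator) in place of real parse_line outputs; the lengths 443 and 280 are taken from the error message. The first option splits every utterance into individual (5-frame window, clean frame) pairs with flat_map and then batches frames, which gives the [723,5,257] layout asked about. The second option keeps utterances whole and zero-pads them with padded_batch. Both flat_map and padded_batch are also part of tf.data in TensorFlow 1.x.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-ins for two parse_line outputs. The values are random
# placeholders, not real spectrograms.
LENGTHS = [443, 280]

def fake_examples():
    rng = np.random.RandomState(0)
    for n in LENGTHS:
        noise = rng.rand(n, 5, 257).astype(np.float32)  # [frames, 5, 257]
        clean = rng.rand(n, 257).astype(np.float32)     # [frames, 257]
        yield noise, clean

def make_utterance_dataset():
    # One element per utterance, with an unknown number of frames.
    return tf.data.Dataset.from_generator(
        fake_examples,
        output_types=(tf.float32, tf.float32),
        output_shapes=(tf.TensorShape([None, 5, 257]),
                       tf.TensorShape([None, 257])))

def frame_level_batches(utterances, batch_size):
    # Option 1: turn each utterance into a dataset of single training pairs,
    # concatenate those datasets, then batch over frames instead of files.
    frames = utterances.flat_map(
        lambda noise, clean: tf.data.Dataset.from_tensor_slices((noise, clean)))
    return frames.batch(batch_size)

def padded_utterance_batches(utterances, batch_size):
    # Option 2: keep utterances whole and zero-pad the time axis up to the
    # longest utterance in each batch.
    return utterances.padded_batch(
        batch_size,
        padded_shapes=(tf.TensorShape([None, 5, 257]),
                       tf.TensorShape([None, 257])))

# Eager iteration (TF 2.x). In TF 1.x graph mode, a one-shot iterator and
# get_next() would be used instead, as in the pipeline above.
noise_batch, clean_batch = next(iter(
    frame_level_batches(make_utterance_dataset(), batch_size=723)))
padded_noise, padded_clean = next(iter(
    padded_utterance_batches(make_utterance_dataset(), batch_size=2)))

print(noise_batch.shape, clean_batch.shape)
print(padded_noise.shape, padded_clean.shape)
```

For a frame-wise autoencoder the first option is probably the closer fit, since every batch then holds a fixed number of training pairs regardless of file length. Frames from one file come out consecutively, so a shuffle placed after flat_map would be needed to mix frames across files. With the second option, the zero-padded frames would have to be masked out of the loss.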

0 answers:

No answers yet