我正在尝试使用Python读取Telegram发送的语音消息,但是对于简短的语音剪辑(<10秒),它不起作用。由于某种原因,它缩短了持续时间。似乎与OGG codec
有关,但是我不确定。
请看下面的代码,语音剪辑大约是6秒钟,但是pydub
读取6秒钟的语音剪辑就是0.06秒。
import telegram
from pydub import AudioSegment
AudioSegment.ffmpeg = "./dependencies/ffmpeg-20180802-c9118d4-win64-static/bin/ffmpeg"
AudioSegment.converter = "./dependencies/ffmpeg-20180802-c9118d4-win64-static/bin/ffmpeg"
bot = telegram.Bot(token=token)
f = bot.get_file(file_id)
f.download('output/voiceclips/{}.ogg'.format(file_id))
myaudio = AudioSegment.from_ogg("output/voiceclips/{}.ogg".format(file_id))
print('ID: {}, which is {} seconds'.format(file_id, myaudio.duration_seconds))
>>> ID: ______, which is 0.06 seconds
当我在VLC-player
中打开文件时,它还指出该文件有0秒。当我尝试使用FFmpeg将其转换为WAV文件时,它会将ogg文件读取为6秒,但是将其写入为0.05秒WAV文件。
ffmpeg -i infile.ogg outfile.wav
ffmpeg version N-91549-gc9118d4d64 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 7.3.1 (GCC) 20180722
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
libavutil 56. 18.102 / 56. 18.102
libavcodec 58. 22.100 / 58. 22.100
libavformat 58. 17.101 / 58. 17.101
libavdevice 58. 4.101 / 58. 4.101
libavfilter 7. 26.100 / 7. 26.100
libswscale 5. 2.100 / 5. 2.100
libswresample 3. 2.100 / 3. 2.100
libpostproc 55. 2.100 / 55. 2.100
[ogg @ 0000020dd375ad40] 727 bytes of comment header remain
Input #0, ogg, from 'infile.ogg':
Duration: 00:00:06.03, start: 0.000000, bitrate: 20 kb/s
Stream #0:0: Audio: opus, 48000 Hz, mono, fltp
Stream mapping:
Stream #0:0 -> #0:0 (opus (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'outfile.wav':
Metadata:
ISFT : Lavf58.17.101
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, mono, s16, 768 kb/s
Metadata:
encoder : Lavc58.22.100 pcm_s16le
size= 6kB time=00:00:00.05 bitrate= 873.0kbits/s speed=4.12x
video:0kB audio:6kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.354167%
对于较大的文件,它可以工作!
答案 0 :(得分:0)
找到了解决方案!我们使用了有效的opustools https://opus-codec.org/docs/。现在,在为下载文件夹设置了环境路径之后,opusdec infile.ogg outfile.wav确实适用于短语音消息