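I am working on a TensorFlow project that learns from audio streams. I am trying to open an audio file and use FFMPEG to store its data in an array. I am following the tutorial here.
My code looks like this: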
import subprocess as sp

FFMPEG_BIN = "ffmpeg"
try:
    if image_file != 'train/rock/.DS_Store':
        command = [FFMPEG_BIN,
                   '-i', image_file,
                   '-f', 's16le',
                   '-acodec', 'pcm_s16le',
                   '-ar', '44100',
                   '-ac', '2',
                   'output.png']
        pipe = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)
        # pipe = sp.Popen(command, stdout=sp.PIPE)
        raw_audio = pipe.proc.stdout.read(88200*4)
But I get this error:
AttributeError: 'Popen' object has no attribute 'proc'
Answer 0 (score: 1)
I am using ffmpeg and pyaudio. This code works for me.
import pyaudio
import subprocess as sp
import numpy

command = ['ffmpeg',
           '-i', "Filename",           # I used a URL stream
           '-loglevel', 'error',
           '-f', 's16le',
           '-acodec', 'pcm_s16le',
           '-ar', '44100',             # output will have 44100 Hz
           '-ac', '2',                 # stereo (set to '1' for mono)
           '-']                        # write the raw audio to stdout
pipe = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)
p = pyaudio.PyAudio()  # PyAudio plays back the raw data read from the pipe
stream = p.open(format=pyaudio.paInt16,
                channels=2,
                rate=44100,
                output=True)
while True:
    raw_audio = pipe.stdout.read(44100*2)  # get raw data (88200 bytes, 0.5 s of audio)
    if not raw_audio:  # empty read: ffmpeg has closed its output
        break
    stream.write(raw_audio)  # play it back
    # Convert the raw data into a numpy array, one row per frame (left, right)
    audio_array = numpy.frombuffer(raw_audio, dtype="int16")
    audio_array = audio_array.reshape((-1, 2))
stream.stop_stream()
stream.close()
p.terminate()
On Ubuntu, you can install pyaudio with:
sudo apt-get install python-pyaudio python3-pyaudio
or
pip install pyaudio
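The read size in the loop above follows from the stream format: s16le stores 2 bytes per sample, so 44100 Hz stereo audio is 44100 * 2 * 2 = 176400 bytes per second, and `read(44100*2)` returns half a second of it. Here is a minimal sketch of that arithmetic and of reading until the stream ends. The helper names `bytes_for_duration` and `iter_chunks` are made up for illustration, and an `io.BytesIO` stands in for `pipe.stdout` so the snippet runs without ffmpeg:

```python
import io


def bytes_for_duration(seconds, rate=44100, channels=2, sample_width=2):
    """Return the number of bytes of raw PCM that cover `seconds` of audio."""
    frame_size = channels * sample_width      # bytes per frame (one sample per channel)
    return int(seconds * rate) * frame_size   # whole frames only


def iter_chunks(stream, chunk_size):
    """Yield chunks read from `stream` until it returns an empty bytes object."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # b"" means end of stream
            break
        yield chunk


# Stand-in for pipe.stdout: one second of silent 44100 Hz stereo s16le audio.
fake_stdout = io.BytesIO(b"\x00" * bytes_for_duration(1.0))
chunk_sizes = [len(c) for c in iter_chunks(fake_stdout, bytes_for_duration(0.5))]
print(chunk_sizes)  # [88200, 88200]
```

Sizing reads in whole frames keeps every chunk a multiple of 4 bytes, so a stereo reshape into two columns never sees a partial frame.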
Answer 1 (score: 0)
As suggested in the comments, this works for me:
import subprocess as sp
import numpy as np

command = ['ffmpeg',
           '-i', 'song.mp3',
           '-f', 's16le',
           '-acodec', 'pcm_s16le',
           '-ar', '22050',
           '-ac', '1',
           '-']                      # write the raw audio to stdout
pipe = sp.Popen(command, stdout=sp.PIPE)
stdoutdata = pipe.stdout.read()      # read everything until ffmpeg exits
audio_array = np.frombuffer(stdoutdata, dtype="int16")
I'm not quite sure why you are trying to convert "image_file" into a ".png" with an audio conversion.
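Two things in the question's code stand out. The command names a file (`output.png`) as its output, so the raw audio goes into that file rather than the pipe, and `Popen` exposes the stream directly as `pipe.stdout`, not `pipe.proc.stdout`. Here is a minimal sketch of one way to address both, assuming ffmpeg is on the PATH. The helper names `build_command`, `decode_s16le` and `read_audio` are hypothetical, and `struct` is used instead of numpy only to keep the example dependency-free:

```python
import struct
import subprocess as sp


def build_command(path, rate=44100, channels=2, ffmpeg_bin="ffmpeg"):
    """Build an ffmpeg command that writes raw s16le PCM to stdout."""
    return [ffmpeg_bin,
            "-i", path,
            "-f", "s16le",
            "-acodec", "pcm_s16le",
            "-ar", str(rate),
            "-ac", str(channels),
            "-"]                      # "-" sends the output to stdout


def decode_s16le(raw, channels=2):
    """Turn raw little-endian 16-bit PCM bytes into a list of per-frame tuples."""
    frame_size = 2 * channels
    usable = len(raw) - len(raw) % frame_size      # drop a trailing partial frame
    samples = struct.unpack("<%dh" % (usable // 2), raw[:usable])
    return [samples[i:i + channels] for i in range(0, len(samples), channels)]


def read_audio(path, seconds=2.0, rate=44100, channels=2):
    """Run ffmpeg and decode up to `seconds` of audio from its stdout."""
    pipe = sp.Popen(build_command(path, rate, channels), stdout=sp.PIPE)
    try:
        # Note the attribute: pipe.stdout, not pipe.proc.stdout.
        raw = pipe.stdout.read(int(seconds * rate) * channels * 2)
    finally:
        pipe.stdout.close()
        pipe.wait()
    return decode_s16le(raw, channels)
```

Calling `read_audio(image_file)` would then return up to two seconds of (left, right) sample pairs, the same 88200*4 bytes the question tried to read. With numpy installed, `np.frombuffer(raw, dtype="int16").reshape((-1, channels))` does the same job as `decode_s16le`.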