我有一个Django网络应用程序,接受用户上传的视频/音频并将其保存到文件夹中 ../ WebAppDirectory / media / recordings '。
然后我使用语音文本API来获得音频的粗略转录。这适用于.wav和.mp4文件,但Web应用程序也接受我想首先转换为.wav的视频(.MOV),然后传递给API。
在我的命令行中使用ffmpeg
ffmpeg -i C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\media\recordings\upload_sample.MOV -ab 160k -ac 2 -ar 44100 -vn upload_sample.wav
从原始.MOV正确创建.wav文件。
然而,当我用python运行
时subprocess.check_call(command, shell=True)
ffmpeg以
回应文件' upload_sample.wav'已经存在。覆盖? [Y / N]
虽然Python告诉我
FileNotFoundError:[Errno 2]没有这样的文件或目录:' C:\ Users \ Nathan \ Desktop \ MeetingRecorderWebAPP \ media \ recordings \ upload_sample.wav'
同样值得注意的是,我没有看到“upload_sample.wav”#39;文件在 media / recordings / 目录中。
这让我相信可能Python和ffmpeg正在寻找不同的文件夹,但我不确定我哪里出错了。当我从subprocess.check_call打印命令并将其复制/粘贴到cmd时,将按预期创建该文件。
希望对ffmpeg / Python子进程有一定经验的人可以帮助解决一些问题!以下是我正在使用的文件:
文件夹结构
DjangoWebApp
|---media
|---|---imgs
|---|---recordings
|---|---|---upload_sample.MOV
|---uploaded_audio_to_text.py
uploaded_audio_to_text.py
import speech_recognition as sr
from os import path
import os
import subprocess
def speech_to_text(file_name):
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), 'media','recordings', file_name)
print("Looking at path: ",AUDIO_FILE)
# get extension
AUDIO_FILE_EXT = os.path.splitext(AUDIO_FILE)[1]
if(AUDIO_FILE_EXT == '.MOV'):
print("File is not .wav: ", AUDIO_FILE_EXT, "found. Converting...")
# We will use subprocess and ffmpeg to convert this .MOV file to .wav, so we can send to API
temp_wav = os.path.splitext(file_name)[0] + '.wav'
print("New audio file will be: ", temp_wav)
# build CMD ffmpeg command
command = "ffmpeg -i "
command += AUDIO_FILE
command += " -ab 160k -ac 2 -ar 44100 -vn "
command += temp_wav
print("Attempting to run this command: \n",command)
print(subprocess.check_call(command, shell=True))
print("Past Subprocess.call")
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), 'media','recordings', temp_wav)
print("AUDIO_FILE now set to: ", AUDIO_FILE)
else:
# continue with what we are doing
pass
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
audio = r.record(source) # read the entire audio file
text_transcription = "Sentinel"
# recognize speech using Microsoft Bing Voice Recognition
BING_KEY = "MY_KEY_:)"
try:
text_transcription = r.recognize_bing(audio, key=BING_KEY)
except sr.UnknownValueError:
print("Microsoft Bing Voice Recognition could not understand audio")
except sr.RequestError as e:
print("Could not request results from Microsoft Bing Voice Recognition service; {0}".format(e))
return text_transcription
#my tests
my_relative_file_path = "upload_sample.MOV"
print(speech_to_text(my_relative_file_path))
控制台输出(追溯和我的打印()&#39>
Looking at path: C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\media\recordings\upload_sample.MOV
File is not .wav: .MOV found. Converting...
New audio file will be: upload_sample.wav Attempting to run this command:
ffmpeg -i C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\media\recordings\upload_sample.MOV -ab 160k -ac 2 -ar 44100 -vn upload_sample.wav
ffmpeg version git-2017-12-18-74f408c Copyright (c) 2000-2017 the FFmpeg developers built with gcc 7.2.0 (GCC)
----REMOVED SOME FFMPEG OUTPUT FOR BREVITY----
File 'upload_sample.wav' already exists. Overwrite ? [y/N] y
Stream mapping: Stream #0:1 -> #0:0 (aac (native) -> pcm_s16le (native)) Press [q] to stop, [?] for help Output #0, wav, to 'upload_sample.wav': Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
com.apple.quicktime.creationdate: 2017-12-19T16:06:10-0500
com.apple.quicktime.make: Apple
com.apple.quicktime.model: iPhone 6
com.apple.quicktime.software: 10.3.3
ISFT : Lavf58.3.100
Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2017-12-19T21:06:11.000000Z
handler_name : Core Media Data Handler
encoder : Lavc58.8.100 pcm_s16le size= 1036kB time=00:00:06.01 bitrate=1411.3kbits/s speed=N/A video:0kB audio:1036kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.007352%
0
Traceback (most recent call last): Past Subprocess.call
File "C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\uploaded_audio_to_text.py", line 53, in <module>
AUDIO_FILE now set to: C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\media\recordings\upload_sample.wav
print(speech_to_text(my_relative_file_path))
File "C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\uploaded_audio_to_text.py", line 36, in speech_to_text
with sr.AudioFile(AUDIO_FILE) as source:
File "C:\Users\Nathan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\speech_recognition\__init__.py", line 203, in __enter__
self.audio_reader = wave.open(self.filename_or_fileobject, "rb")
File "C:\Users\Nathan\AppData\Local\Programs\Python\Python36-32\lib\wave.py", line 499, in open
return Wave_read(f)
File "C:\Users\Nathan\AppData\Local\Programs\Python\Python36-32\lib\wave.py", line 159, in __init__
f = builtins.open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\Nathan\\Desktop\\MeetingRecorderWebAPP\\media\\recordings\\upload_sample.wav'
Process finished with exit code 1
答案 0 :(得分:0)
aergistal建议的正确答案。我需要设置输出文件的绝对路径!
更改了命令
ffmpeg -i C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\media\recordings\upload_sample.MOV -ab 160k -ac 2 -ar 44100 -vn upload_sample.wav
到新命令
ffmpeg -y -i C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\media\recordings\upload_test_2.MOV -ab 160k -ac 2 -ar 44100 -vn C:\Users\Nathan\Desktop\MeetingRecorderWebAPP\media\recordings\upload_test_2.wav