I have the following script:
# Run this in the directory containing Glossika GMS-C mp3 files, such
# as for example ENFR-F1-GMS-C-0051.MP3.
# Uses mp3splt which should be available.
# Will REMOVE all existing files of the form ???.mp3 in the current directory,
# because it uses them for temporary storage, beware.
# Questions, suggestions, etc.:
# if you're a power user or a programmer, use issues/pull requests on the
# Gitlab project, https://gitlab.com/avorobey/glossika-tools, or email as
# you feel appropriate.
# if you're not, email me at avorobey@gmail.com.
import glob
import os
import re
import subprocess
import sys

files = glob.glob("ENAR-F1-GMS-C-????.mp3")
print "found: ", len(files), " files."
for file in sorted(files):
    print "processing: ", file
    result = re.match("^EN(.{2,4})-..-GMS-C-(\d\d\d\d).mp3$", file)
    if not result:
        print "unmatched: ", file
        continue
    lang = result.group(1).lower()
    fr = int(result.group(2))
    to = fr+49
    # print "from: ", fr, ", to: ", to

    # removing existing split files
    for f in glob.glob("???.mp3"):
        os.remove(f)

    # running mp3splt
    devnull = open(os.devnull, 'w')
    # Documentation of some important mp3splt options I use and why:
    # -n to not write tags in target files. Makes them easily concatenatable.
    # -x to not write the Xing header in target files. Same reason as -n.
    # -p parameters:
    #   min=1.9 to detect 1.9 seconds or more as silence. Normally silence between
    #     sentences as found by mp3splt is just under 3 seconds, but 1.9 also
    #     catches the pause between some standard phrases at the beginning of a
    #     file.
    #   th=-64 to set the threshold for detecting silence at -64 dB. The default
    #     value of -48 makes mp3splt miss some of the silences, I don't know why.
    #   shots=12 to set the threshold for detecting non-silence. With the default
    #     value of 25 some very short sentences are processed as part of silence.
    #   rm=0.1_0 to remove silence between tracks, but leave 0.1 seconds at
    #     the beginning of a track. Without this 0.1, just with "rm", I find that
    #     sometimes mp3splt cuts off a little bit of real sound by mistake.
    #     Comparing with audacity, I see that silence intervals written into
    #     mp3splt.log are detected correctly, but for some reason when mp3splt
    #     tries to find the position of the end of the silence interval, it
    #     overshoots by quite a bit. I suspect a deeply placed bug in how mp3splt
    #     processes mpeg frames, and its code is NOT fun to read or debug, so this
    #     is the workaround I found for now.
    status = subprocess.call(["mp3splt", "-n", "-x", "-s",
                              "-p", "rm=0.1_0,min=1.9,th=-64,shots=12", "-o", "@N3", file],
                             stdout=devnull, stderr=devnull)
    if status != 0:
        sys.exit("bad status running mp3splt")
    split = glob.glob("???.mp3")
    if len(split) != 54:
        sys.exit("surprising number of files: " + str(len(split)) + " for: " + file)
    # use files from 003.mp3 to 052.mp3 inclusive.
    split = sorted(split)[2:52]
    # start at from-1, so for example 1st sentence becomes fr-0000.mp3
    cnt = fr-1
    for f in split:
        name = "{0}-{1:04d}.mp3".format(lang, cnt)
        cnt += 1
        os.rename(f, name)
    print "finished with: ", file

# cleanup
for f in glob.glob("???.mp3"):
    os.remove(f)
os.remove("mp3splt.log")
and I get the following error:
found: 20 files.
processing: ENAR-F1-GMS-C-0001.mp3
Traceback (most recent call last):
  File "glossika-split.py", line 59, in <module>
    stdout=devnull, stderr=devnull)
  File "/usr/local/Cellar/python/2.7.14/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 168, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/local/Cellar/python/2.7.14/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/usr/local/Cellar/python/2.7.14/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1025, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
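I'm not sure what exactly this means, since I'm not a Python programmer, but I'm trying to get this script to work. I tried invoking it with python script.py as well as python2 script.py. I suspect stdout isn't set up correctly, but I'm not sure. Any help would be much appreciated!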
Answer 0 (score: 2)
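The file that cannot be found is the mp3splt executable itself. subprocess.call raises OSError: [Errno 2] when the program it is asked to run does not exist or is not on your PATH, so the stdout setting is not the problem. First, try running mp3splt directly from the command line to check that it is installed. Second, if it is installed but still not found, replace "mp3splt" in the script with the full path to the executable.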
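To check whether Python can see mp3splt before the script ever calls it, something along these lines might help. This is only a rough sketch: find_on_path is a hypothetical helper written for illustration (it is not part of the original script), and it assumes a Unix-like system where executables are located through the PATH environment variable. It is written to run under both Python 2.7 and Python 3.

```python
import os


def find_on_path(program):
    """Return the full path of `program` if it is an executable file in one
    of the PATH directories, otherwise None.

    Hypothetical helper for illustration; not part of the original script.
    """
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, program)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Report what was found instead of failing with a bare OSError later on.
mp3splt_path = find_on_path("mp3splt")
if mp3splt_path is None:
    print("mp3splt not found on PATH: install it or use its full path")
else:
    print("using mp3splt at: " + mp3splt_path)
```

If the lookup prints a path, that string can be used in place of "mp3splt" as the first element of the list passed to subprocess.call. If it prints the "not found" message, the program is either not installed or installed in a directory that is not on PATH.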