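I am looking for a fast, preferably standard-library, mechanism to determine the bit depth of a wav file, e.g. '16-bit' or '24-bit'.
I am using a subprocess call to Sox to get a plethora of audio metadata, but a subprocess call is very slow, and the only information I can currently get reliably only from Sox is the bit depth.
The built-in wave module has no function like "getbitdepth()" and is also not compatible with 24-bit wav files. I could use a 'try except' to access the file metadata with the wave module (and, if that works, manually record the file as 16-bit), then in the except branch call sox instead (where sox performs the analysis to record the bit depth accurately). My concern is that this approach feels like guesswork. What if an 8-bit file is read? I would be manually assigning 16-bit when it is not.
scipy.io.wavfile is also not compatible with 24-bit audio, so it creates a similar problem.
This tutorial is really interesting and even includes some very low-level (low-level for Python, at least) scripting examples for extracting information from wav file headers. Unfortunately these scripts do not work for 16-bit audio.
Is there any way to simply (and without calling sox) determine the bit depth of the wav file I am checking?
The wave header parser script I am using is as follows: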
import struct

def print_wave_header(f):
    '''
    Function takes an audio file path as a parameter and
    returns a dictionary of metadata parsed from the header.
    Assumes a canonical 44-byte PCM header and 16-bit samples.
    '''
    r = {}  # the results of the header parse
    r['path'] = f
    # "<" in the format strings forces little-endian, the byte order RIFF/WAVE files use
    with open(f, "rb") as fin:  # "r" flag - read, "b" flag - binary
        r["ChunkID"] = fin.read(4)  # First four bytes are ChunkID, which must be b"RIFF"
        ChunkSize = struct.unpack('<I', fin.read(4))  # File size in bytes minus 8, as unsigned 32-bit integer
        TotalSize = ChunkSize[0] + 8  # The subscript is used because struct.unpack returns a tuple
        r["TotalSize"] = TotalSize
        r["DataSize"] = TotalSize - 44  # Number of data bytes, assuming a 44-byte header
        r["Format"] = fin.read(4)  # b"WAVE"
        r["SubChunk1ID"] = fin.read(4)  # b"fmt "
        r["SubChunk1Size"] = struct.unpack("<I", fin.read(4))[0]  # 16 for plain PCM
        r["AudioFormat"] = struct.unpack("<H", fin.read(2))[0]  # 1 for PCM
        r["NumChannels"] = struct.unpack("<H", fin.read(2))[0]  # 1 for mono, 2 for stereo
        r["SampleRate"] = struct.unpack("<I", fin.read(4))[0]  # e.g. 44100 (CD sampling rate)
        r["ByteRate"] = struct.unpack("<I", fin.read(4))[0]  # SampleRate * NumChannels * BitsPerSample / 8
        r["BlockAlign"] = struct.unpack("<H", fin.read(2))[0]  # NumChannels * BitsPerSample / 8
        r["BitsPerSample"] = struct.unpack("<H", fin.read(2))[0]  # e.g. 16 (CD audio) or 24
        r["SubChunk2ID"] = fin.read(4)  # b"data" when no other chunk precedes it
        r["SubChunk2Size"] = struct.unpack("<I", fin.read(4))[0]  # Number of data bytes
        # First five 16-bit values of the data chunk, each between -32768 and 32767;
        # they are only meaningful samples when BitsPerSample is 16
        for i in range(1, 6):
            r["S%d" % i] = struct.unpack("<h", fin.read(2))[0]
    return r
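The parser above reads every field at a fixed offset, so it breaks as soon as the fmt chunk is longer than 16 bytes or another chunk precedes it. A minimal standard-library sketch that walks the chunk list and reads only the bits-per-sample field might look like the following. The name `wav_bit_depth` is hypothetical, and the sketch assumes a well-formed little-endian RIFF/WAVE file.

```python
import struct


def wav_bit_depth(path):
    """Return the bits-per-sample field from the 'fmt ' chunk of a WAV file.

    Hypothetical helper, not part of any library. Assumes a little-endian
    RIFF/WAVE file; raises ValueError if no 'fmt ' chunk can be found.
    """
    with open(path, "rb") as fin:
        header = fin.read(12)
        if len(header) < 12:
            raise ValueError("file too short to be a WAV file: %r" % path)
        riff_id, _riff_size, wave_id = struct.unpack("<4sI4s", header)
        if riff_id != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file: %r" % path)

        while True:
            chunk_header = fin.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no 'fmt ' chunk found in %r" % path)
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)

            if chunk_id == b"fmt ":
                fmt = fin.read(16)
                if len(fmt) < 16:
                    raise ValueError("truncated 'fmt ' chunk in %r" % path)
                # Fields: format tag, channels, sample rate, byte rate,
                # block align, bits per sample.
                return struct.unpack("<HHIIHH", fmt)[5]

            # Skip any other chunk; chunks are padded to an even length.
            fin.seek(chunk_size + (chunk_size & 1), 1)
```

Calling `wav_bit_depth('example.wav')` would then return an integer such as 16 or 24 without reading any sample data. One caveat: for WAVE_FORMAT_EXTENSIBLE files this field is the container size, and the number of valid bits is stored separately in the fmt extension, which this sketch does not read.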
Answer 0 (score: 4)
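I strongly recommend the soundfile module (but note that I am very biased, because I wrote large parts of it).
There, you can open the file as a soundfile.SoundFile object, which has a subtype attribute that holds the information you are looking for.
In your case that would probably be 'PCM_16' or 'PCM_24'.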
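If a number is needed instead of the subtype string, a small lookup table is one way to get it. The sketch below is an assumption on my part, not something soundfile provides: the names `SUBTYPE_BITS`, `bit_depth_from_subtype` and `soundfile_bit_depth` are hypothetical, and the table only covers the uncompressed subtypes I would expect for wav files.

```python
# Assumed mapping from subtype names to bits per sample.
# Only uncompressed PCM and float subtypes are listed.
SUBTYPE_BITS = {
    "PCM_S8": 8,
    "PCM_U8": 8,
    "PCM_16": 16,
    "PCM_24": 24,
    "PCM_32": 32,
    "FLOAT": 32,
    "DOUBLE": 64,
}


def bit_depth_from_subtype(subtype):
    """Return the bit depth for a subtype string, or None if it is not in the table."""
    return SUBTYPE_BITS.get(subtype)


def soundfile_bit_depth(path):
    """Open the file with soundfile and translate its subtype to a bit depth."""
    import soundfile as sf  # third-party: pip install soundfile

    with sf.SoundFile(path) as f:
        return bit_depth_from_subtype(f.subtype)
```

Opening a SoundFile only reads the header, so this should be much cheaper than a subprocess call per file. Subtypes that are missing from the table (compressed formats, for example) yield None instead of a guess.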
Answer 1 (score: 2)
Basically the same as Matthias's answer, but with copy-pasteable code.
$ pip install soundfile

import soundfile as sf

ob = sf.SoundFile('example.wav')
print('Sample rate: {}'.format(ob.samplerate))
print('Channels: {}'.format(ob.channels))
print('Subtype: {}'.format(ob.subtype))
A subtype of PCM_16 means a bit depth of 16 bits, where PCM stands for Pulse-Code Modulation. If you are only looking for a command-line tool, I can recommend MediaInfo:
$ mediainfo example.wav
General
Complete name          : example.wav
Format                 : Wave
File size              : 83.2 MiB
Duration               : 8 min 14 s
Overall bit rate mode  : Constant
Overall bit rate       : 1 411 kb/s

Audio
Format                 : PCM
Format settings        : Little / Signed
Codec ID               : 1
Duration               : 8 min 14 s
Bit rate mode          : Constant
Bit rate               : 1 411.2 kb/s
Channel(s)             : 2 channels
Sampling rate          : 44.1 kHz
Bit depth              : 16 bits
Stream size            : 83.2 MiB (100%)
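If you do end up scripting around MediaInfo, the bit depth can be pulled out of a report like the one above with a regular expression. This is only a sketch: `parse_bit_depth` is a hypothetical helper, and it assumes the report contains a line of the form "Bit depth : 16 bits" as in the sample output.

```python
import re


def parse_bit_depth(report):
    """Return the bit depth from MediaInfo's text report, or None if absent.

    Hypothetical helper; assumes a line such as 'Bit depth : 16 bits'.
    """
    match = re.search(r"^Bit depth\s*:\s*(\d+)\s*bits?", report, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# The report text could be captured with, for example:
#   subprocess.run(["mediainfo", path], capture_output=True, text=True).stdout
```

Keep in mind that this still costs one subprocess call per file, which is exactly what the question wants to avoid, so it is mainly useful for spot checks.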