Question

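Note: I have already read Importing sound files into Python as NumPy arrays (alternatives to audiolab) and tried all the answers, including those that require calling ffmpeg with Popen and reading the content from the stdout pipe, etc. I have also read Trying to convert an mp3 file to a Numpy Array, and ffmpeg just hangs, etc., and tried the main answers, but found no simple solution. After spending hours on this, I am posting it here under "Answer your own question - share your knowledge, Q&A-style". I have also read How to create a numpy array from a pydub AudioSegment?, but it does not easily cover the multi-channel case, etc.

Is there a way to read/write an MP3 audio file into/from a numpy array with an API similar to scipy.io.wavfile.read and scipy.io.wavfile.write?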

from scipy.io import wavfile

sr, x = wavfile.read('test.wav')
wavfile.write('test2.wav', sr, x)

Note: pydub's AudioSegment object does not give direct access to a numpy array.

Answer 1

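Calling ffmpeg and manually parsing its stdout, as suggested in many posts about reading MP3 files, is a tedious task (there are many corner cases, because different numbers of channels are possible, etc.), so here is a working solution using pydub (you first need to pip install pydub).

This code reads an MP3 file into a numpy array and writes a numpy array to an MP3 file, with an API similar to scipy.io.wavfile.read/write:

import pydub
import numpy as np

def read(f, normalized=False):
    """MP3 to numpy array"""
    a = pydub.AudioSegment.from_mp3(f)
    y = np.array(a.get_array_of_samples())
    if a.channels == 2:
        # stereo samples are interleaved (L, R, L, R, ...): one row per frame
        y = y.reshape((-1, 2))
    if normalized:
        # scale 16-bit integers to floats in [-1, 1)
        return a.frame_rate, np.float32(y) / 2**15
    else:
        return a.frame_rate, y

def write(f, sr, x, normalized=False):
    """numpy array to MP3"""
    channels = 2 if (x.ndim == 2 and x.shape[1] == 2) else 1
    if normalized:
        # normalized array - each item should be a float in [-1, 1)
        y = np.int16(x * 2 ** 15)
    else:
        y = np.int16(x)
    song = pydub.AudioSegment(y.tobytes(), frame_rate=sr, sample_width=2, channels=channels)
    song.export(f, format="mp3", bitrate="320k")

Notes:

It only works for 16-bit files for now (even though 24-bit WAV files are quite common, I have rarely seen 24-bit MP3 files... do they exist?)

normalized=True allows working with a float array (each item in [-1, 1))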
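The answer's read() reshapes only when there are exactly two channels, and its write() assumes floats strictly below 1.0, since a sample equal to exactly 1.0 scales to 32768, which is outside the int16 range. A minimal numpy-only sketch of how both points could be generalized follows. The helper names deinterleave, int16_to_float and float_to_int16 are hypothetical and not part of pydub or of the answer.

```python
import numpy as np

def deinterleave(samples, channels):
    """Turn a flat, interleaved sample sequence into shape (n_frames, channels)."""
    y = np.asarray(samples)
    if channels == 1:
        return y  # mono stays 1-D, like scipy.io.wavfile.read
    if y.size % channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    return y.reshape((-1, channels))

def int16_to_float(y):
    """Scale int16 samples to float32 values in [-1, 1)."""
    return y.astype(np.float32) / 2**15

def float_to_int16(x):
    """Scale floats in [-1, 1] to int16, clipping so that 1.0 cannot overflow."""
    scaled = np.round(np.asarray(x, dtype=np.float64) * 2**15)
    return np.clip(scaled, -2**15, 2**15 - 1).astype(np.int16)

# Six interleaved samples from a hypothetical 3-channel stream (illustrative data)
frames = deinterleave([0, 1, 2, 3, 4, 5], channels=3)
```

In read(), the reshape step could then become deinterleave(a.get_array_of_samples(), a.channels). Whether the clipping is wanted depends on how the float data was produced.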

Usage example:

sr, x = read('test.mp3')
write('test2.mp3', sr, x)
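To try the write function from the code above without an existing MP3 at hand, one option is to synthesize a short test signal with numpy. This is only a sketch: the sample rate, tone frequencies, amplitude and output file name are assumed values chosen for illustration.

```python
import numpy as np

sr = 44100                  # sample rate in Hz (assumed value)
t = np.arange(sr) / sr      # time stamps for one second of audio
left = 0.5 * np.sin(2 * np.pi * 440.0 * t)    # 440 Hz tone, left channel
right = 0.5 * np.sin(2 * np.pi * 660.0 * t)   # 660 Hz tone, right channel

# Stack the channels as columns: shape (n_frames, 2), floats inside [-1, 1)
x = np.stack([left, right], axis=1).astype(np.float32)

# With the answer's write() defined, the array could then be saved with:
# write('out.mp3', sr, x, normalized=True)
```

The amplitude of 0.5 keeps every sample well inside [-1, 1), which is the range that normalized=True expects.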

Answer 2

You can use the audio2numpy library. Install it with:

pip install audio2numpy

Then your code will be:

import audio2numpy as a2n
x,sr=a2n.audio_from_file("test.mp3")

For writing, use @Basj's answer.

How to read an MP3 audio file into a numpy array / save a numpy array to MP3?
