Question

I am trying to do a naive volume adjustment on a sound file. I am using Python 2.7 and the following libraries:

import numpy as np
import scipy.io.wavfile as wv
import matplotlib.pyplot as plt
import pyaudio
import wave

I tried two approaches, in both cases trying to amplify the sound by a factor of 2, i.e. n = 2. The first is a modified dynamic range limiter (http://bastibe.de/2012-11-02-real-time-signal-processing-in-python.html):

def limiter(self, n):
    #best version so far
    signal=self.snd_array
    attack_coeff = 0.01
    framemax=2**15-1
    threshold=framemax
    for i in np.arange(len(signal)):
        #if amplitude value * amplitude gain factor is > threshold set an interval to decrease the amplitude
        if signal[i]*n > threshold:
            gain=1
            jmin=0
            jmax=0
            if i-100>0:
                jmin=i-100
            else:
                jmin=0
            if i+100<len(signal):
                jmax=i+100
            else:
                jmax=len(signal)
            for j in range(jmin,jmax):
                #target gain is amplitude factor times exponential to smoothly decrease the amp factor (n)
                target_gain = n*np.exp(-10*(j-jmin))
                gain = (gain*attack_coeff + target_gain*(1-attack_coeff))
                signal[j]=signal[j]*gain
        else:
            signal[i] = signal[i]*n
    print max(signal),min(signal)
    plt.figure(3)
    plt.plot(signal)
    return signal

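The second approach applies hard-knee compression to reduce the amplitude of samples that exceed a threshold, and then amplifies the whole signal by the gain factor.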

def compress(self,n):
     print 'start compress'
     threshold=2**15/n+1000
     #compress all values above the threshold, therefore limiting the audio amplitude range
     for i in np.arange(len(self.snd_array)):
         if abs(self.snd_array[i])>threshold:
             factor=1+(threshold-abs(self.snd_array[i]))/threshold
         else:
             factor=1.0
         #apply compression factor and amp gain factor (n)
         self.snd_array[i] = self.snd_array[i]*factor*n
     print np.min(self.snd_array),np.max(self.snd_array)
     plt.figure(2)
     plt.plot(self.snd_array,'k')
     return self.snd_array

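With both methods the file ends up distorted. At points where the amplitude approaches the threshold, the music clips and crackles. I think this is because the signal "flattens out" as it nears the threshold. I tried applying an exponential in the limiter function, but it does not completely remove the crackling, even when I make it decay very quickly. If I change n to 1.5, the sound is not distorted. If anyone could give me any pointers on how to remove the crackling distortion, or point me to other volume modulation code, it would be much appreciated.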

Answer 1

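This may not be 100% on topic, but maybe it is interesting for you anyway. If you do not need real-time processing, things become much easier. Limiting and dynamic compression can be seen as applying a dynamic transfer function. This function simply maps input values to output values. A linear function then returns the original audio, while a "curved" function compresses or expands it. Applying a transfer function is as simple as: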

import numpy as np
from scipy.interpolate import interp1d
from scipy.io import wavfile

def apply_transfer(signal, transfer, interpolation='linear'):
    constant = np.linspace(-1, 1, len(transfer))
    interpolator = interp1d(constant, transfer, interpolation)
    return interpolator(signal)
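As a quick check of the claim that a linear transfer returns the original audio, here is a minimal sketch. It uses a NumPy-only stand-in for apply_transfer built on np.interp instead of scipy's interp1d (so linear interpolation only); the name apply_transfer_np and the sample values are made up for illustration.

```python
import numpy as np

def apply_transfer_np(signal, transfer):
    # Hypothetical NumPy-only variant of apply_transfer: np.interp performs
    # the same linear lookup on a grid spanning [-1, 1].
    grid = np.linspace(-1, 1, len(transfer))
    return np.interp(signal, grid, transfer)

# Illustrative samples, already scaled to [-1, 1]
signal = np.array([-1.0, -0.5, 0.0, 0.25, 1.0])

# Straight line: every input maps to itself
identity = np.linspace(-1, 1, 1000)

# Curved line: arctan, rescaled so that +/-1 still maps to +/-1
curved = np.arctan(2 * identity)
curved /= np.abs(curved).max()

unchanged = apply_transfer_np(signal, identity)
compressed = apply_transfer_np(signal, curved)

print(unchanged)
print(compressed)
```

With the curved transfer, mid-level samples are pushed up towards full scale while the peaks stay at +/-1, which is what compression means here.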

Limiting or compressing is then just a matter of choosing a different transfer function:

# hard limiting
def limiter(x, treshold=0.8):
    transfer_len = 1000
    transfer = np.concatenate([ np.repeat(-1, int(((1-treshold)/2)*transfer_len)),
                                np.linspace(-1, 1, int(treshold*transfer_len)),
                                np.repeat(1, int(((1-treshold)/2)*transfer_len)) ])
    return apply_transfer(x, transfer)

# smooth compression: for a small factor the curve is nearly linear;
# the larger the factor, the stronger the compression
def arctan_compressor(x, factor=2):
    constant = np.linspace(-1, 1, 1000)
    transfer = np.arctan(factor * constant)
    transfer /= np.abs(transfer).max()
    return apply_transfer(x, transfer)
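To tie this back to the n = 2 amplification in the question, one possible sketch is to apply the gain and a smooth saturation in a single step. This swaps arctan for tanh, chosen here only because its slope at zero is exactly 1, so quiet samples get a gain close to n while loud samples bend smoothly towards full scale instead of flattening abruptly. The helper name soft_gain and the sample values are assumptions for illustration, not part of the code above.

```python
import numpy as np

def soft_gain(x, n=2.0):
    # Hypothetical helper. x: float audio scaled to [-1, 1]; n: desired gain.
    # tanh(n * x) is close to n * x for small x and stays strictly inside
    # (-1, 1) for any input, so the result never exceeds full scale.
    x = np.asarray(x, dtype=np.float64)
    return np.tanh(n * x)

# Illustrative samples: one quiet, one medium, one loud
x = np.array([0.01, 0.3, 0.9])
y = soft_gain(x, n=2.0)

print(y / x)  # effective gain per sample: near n when quiet, lower when loud
```

The trade-off is that loud passages receive less than the requested gain, which is the price of avoiding hard clipping.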

This example assumes a 16-bit mono WAV file as input:

sr, x = wavfile.read("input.wav")
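x = x.astype(np.float64)  # convert to float first: with int16 samples, Python 2's "/" is integer division
x = x / np.abs(x).max()   # scale x to the range -1..1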

x2 = limiter(x)
x2 = np.int16(x2 * 32767)
wavfile.write("output_limit.wav", sr, x2)

x3 = arctan_compressor(x)
x3 = np.int16(x3 * 32767)
wavfile.write("output_comp.wav", sr, x3)
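The np.int16(x * 32767) conversion above is only safe while the processed signal stays within [-1, 1]; anything larger would wrap around and produce loud clicks. A defensive sketch, assuming float input, is to clip before converting. The helper name to_int16 and the sample values are hypothetical.

```python
import numpy as np

def to_int16(x):
    # Hypothetical helper: clip float audio to [-1, 1], then scale to the
    # int16 range so that out-of-range samples saturate instead of wrapping.
    clipped = np.clip(np.asarray(x, dtype=np.float64), -1.0, 1.0)
    return np.round(clipped * 32767).astype(np.int16)

# Illustrative samples, two of them deliberately out of range
samples = np.array([-1.5, -1.0, 0.0, 0.5, 1.2])
pcm = to_int16(samples)

print(pcm.dtype, pcm.min(), pcm.max())
```

Clipping here is a last-resort safety net; if many samples hit it, the transfer function before it is letting too much through.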

Maybe this clean offline code can help you benchmark your real-time code.
