Question

I am reading audio data from a file as floats and, for example, I get these values:

-4,151046E+34 
-2,365558E+38 
6,068741E+26 
-4,141856E+34 
-2,179363E+38 
 1,177772E-04 
-1,035052E+34 
-1,#QNAN 
 2,668123E-20 
-1,0609E+37 
-2,153349E+38 
 1,105884E-16 
-4,25223E+37 
-1,#QNAN 
-3,718855E+22 
-1,695596E+38

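I want to detect when a silence starts and ends.

Are these values directly related to the volume, or is each value just a snapshot of the waveform at that instant, so that I need to look at many of them to detect silence?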

Answer 1

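Silence is a concept tied to perception, and it has a temporal aspect... silence cannot occur in a single instant surrounded by loud audio, because it would not be perceived as silence

Silence occurs when the audio curve sits at, or stays close to, the zero crossing point for a perceptible stretch of time... you cannot have audible audio, then silence that lasts only an instant, then audible audio again... that is not silence... your eardrum, or the diaphragm of a microphone in a silent room, does not vibrate... as the loudness in the room rises from silence, these surfaces start to wobble... the plot you show can be thought of as a visualization of that wobble... on the plot, the only silence occurs during the flat stretch at the beginning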

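To identify programmatically when silence occurs, you need two parameters

a maximum height of the audio curve, below which you declare a sample to be a candidate for silence
a minimum length of time for which the audio curve must stay below that maximum height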

You can try guessing these values... now let's identify when silence occurs

package main

import (
    "fmt"
    "math"
)

func main() {

    //  placeholder: fill audio_buffer with your decoded samples (assumed to be normalized floats)
    var audio_buffer []float64

    flag_in_candidate_silence := false          //  current sample is quiet
    flag_currently_in_declared_silence := false //  current stretch of samples are in silence period

    total_num_samples := len(audio_buffer) // identify how many samples

    max_vol := 0.1        //  max volume and still a silence candidate
    min_num_samples := 2000 //  minimum number of samples necessary to declare silence has happened
                            //  value used is dependent on sampling rate

    curr_num_samples_found := 0

    index_silence_starts := 0
    index_silence_ends := 0

    for curr_sample := 0; curr_sample < total_num_samples; curr_sample++ {

        //  use the magnitude so that loud negative samples are not mistaken for silence
        curr_amplitude := math.Abs(audio_buffer[curr_sample])

        if curr_amplitude < max_vol { // current sample is candidate for silence

            index_silence_ends = curr_sample

            if flag_in_candidate_silence != true { // previous sample was not a candidate

                index_silence_starts = curr_sample
            }

            if curr_num_samples_found > min_num_samples {

                //  we are inside a period of silence !!!!!!!!!!!

                flag_currently_in_declared_silence = true
            }

            flag_in_candidate_silence = true
            curr_num_samples_found++ //  increment counter of current stretch of silence candidates

        } else {

            if flag_currently_in_declared_silence == true {

                fmt.Println("found silence stretch of samples from ", index_silence_starts, " to ", index_silence_ends)
            }

            flag_in_candidate_silence = false
            flag_currently_in_declared_silence = false
            curr_num_samples_found = 0
        }
    }

    if flag_currently_in_declared_silence == true {

        fmt.Println("found silence stretch of samples from ", index_silence_starts, " to ", index_silence_ends)
    }
}

(untested code, straight off the top of my head)

Detecting silence from float audio values
