Question

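(This is in Python, and code would be great, but I'm primarily interested in the algorithm.)

I'm listening to an audio stream (PyAudio) and looking for a series of 5 pops (see the visualization at the bottom). I read() the stream and get the RMS value of the block I've just read (similar to this question). My problem is that I'm not looking for a single event, but for a series of events (pops) that have certain characteristics yet aren't as boolean as I'd like. What's the most straightforward (and performant) way to detect these five pops?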
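For readers following along, the per-block RMS mentioned above could be computed roughly as follows. This is only a sketch: it assumes 16-bit signed little-endian samples and scales them to the range [-1, 1], which would match the small values shown in this question. The function name `block_rms` is made up for illustration.

```python
import math
import struct


def block_rms(block: bytes) -> float:
    """RMS of a block of 16-bit signed little-endian samples, scaled to [0, 1]."""
    count = len(block) // 2  # two bytes per sample
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, block[:count * 2])
    # Normalize each sample to [-1, 1) before squaring (assumed scaling).
    sum_squares = sum((s / 32768.0) ** 2 for s in samples)
    return math.sqrt(sum_squares / count)
```

With PyAudio, the bytes returned by a stream's read() call would be passed in as `block`.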

The RMS function gives me a stream like this:

0.000580998485254, 0.00045098391298, 0.00751436443973, 0.002733730043, 0.00160775708652, 0.000847808804511

It looks a little more useful if I round it for you (this is a similar stream):

0.001, 0.001, 0.018, 0.007, 0.003, 0.001, 0.001

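You can see the pop in item 3, which presumably quiets down in item 4, with maybe a bit of the tail in item 5.

I want to detect 5 of those in a row.

My naive approach is to: a) Define what a pop is: a block's RMS is over .002, for at least 2 blocks but no more than 4 blocks. It starts with silence and ends with silence.

Additionally, I'm tempted to define what silence is (ignoring blocks that are not quite loud but not quite silent), but I'm not sure whether that makes more sense than treating 'pop' as a boolean.

b) Then have a state machine that keeps track of a bunch of variables and has a bunch of if statements. Like: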

# Counters and state. stream, isRMSAmplitudeLoudEnoughToBeAPop,
# fivePopsCallback, MAX_POP_RECORDS and MAX_SILENCE_BLOCKS are placeholders.
state = 'silence'
num_pop_blocks = 0
num_silence_blocks = 0
num_sequential_pops = 0

while True:
  is_pop = isRMSAmplitudeLoudEnoughToBeAPop(stream.read())

  if is_pop:
    if state == 'pop':
      # continuation of a pop (or maybe this continuation means
      # that it's too long to be a pop)
      if num_pop_blocks <= MAX_POP_RECORDS:
        num_pop_blocks += 1
      else:
        # too long to be a pop
        state = 'waiting'
        num_sequential_pops = 0
    elif state == 'silence':
      # possible beginning of a pop
      state = 'pop'
      num_pop_blocks += 1
      num_silence_blocks = 0
  else:
    # silence
    if state == 'pop':
      # we just transitioned from pop to silence
      num_sequential_pops += 1

      if num_sequential_pops == 5:
        # we did it
        state = 'waiting'
        num_sequential_pops = 0
        num_silence_blocks = 0

        fivePopsCallback()
    elif state == 'silence':
      if num_silence_blocks >= MAX_SILENCE_BLOCKS:
        # now we're just waiting
        state = 'waiting'
        num_silence_blocks = 0
        num_sequential_pops = 0

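The code is by no means complete (and probably has a bug or two), but it illustrates my line of thinking. It's certainly more complicated than I'd like, which is why I'm asking for suggestions.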
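As a point of comparison, the pop definition in (a) can be expressed without explicit state by grouping consecutive loud and quiet blocks into runs. This is a minimal sketch, assuming the RMS values are already collected in a list rather than arriving live. The names `find_pops`, `THRESHOLD`, `MIN_BLOCKS` and `MAX_BLOCKS` are illustrative; the numbers come from the definition above.

```python
from itertools import groupby

THRESHOLD = 0.002   # RMS above this counts as loud
MIN_BLOCKS = 2      # a pop lasts at least this many blocks
MAX_BLOCKS = 4      # ... and at most this many


def find_pops(rms_values):
    """Return (start_index, length) for each loud run that qualifies as a pop."""
    runs = []
    index = 0
    for is_loud, group in groupby(rms_values, key=lambda v: v > THRESHOLD):
        length = len(list(group))
        runs.append((is_loud, index, length))
        index += length

    pops = []
    for i, (is_loud, start, length) in enumerate(runs):
        # Runs alternate loud/quiet, so any loud run that is neither the
        # first nor the last run starts and ends with silence.
        bounded = 0 < i < len(runs) - 1
        if is_loud and bounded and MIN_BLOCKS <= length <= MAX_BLOCKS:
            pops.append((start, length))
    return pops
```

Applied to the rounded stream shown earlier, this groups items 3 to 5 into a single loud run of three blocks.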

Answer 1

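You might want to compute the simple moving average of the last P points, where P ≈ 4, and plot the result together with your raw input data.

You could then use the maxima of the smoothed average as the pops. Define a maximum interval within which the five pops must occur, and that could be what you're after.

Adjust P for the best fit.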
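A rough sketch of that idea, assuming the RMS values are available as a list. The helper names, the `threshold` parameter, and the way a "maximum" is picked (a local maximum above a threshold) are my own assumptions, not part of any library.

```python
from collections import deque


def moving_average(values, p=4):
    """Yield the mean of the last p values (fewer while the window fills)."""
    window = deque(maxlen=p)
    for v in values:
        window.append(v)
        yield sum(window) / len(window)


def find_peaks(smoothed, threshold):
    """Indices where the smoothed signal is a local maximum above threshold."""
    peaks = []
    for i in range(1, len(smoothed) - 1):
        if (smoothed[i] >= threshold
                and smoothed[i] > smoothed[i - 1]
                and smoothed[i] >= smoothed[i + 1]):
            peaks.append(i)
    return peaks


def has_five_pops(peaks, max_interval, count=5):
    """True if `count` consecutive peaks fit within max_interval blocks."""
    return any(peaks[i + count - 1] - peaks[i] <= max_interval
               for i in range(len(peaks) - count + 1))
```

Note that smoothing shifts each peak later by up to P - 1 blocks, so the interval check compares peak positions relative to each other rather than against absolute times.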

I wouldn't be surprised if there were already a Python module for this, but I haven't looked.

Answer 2

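What I ended up with was a naive approach: an ongoing loop and a few variables to maintain the current state and transition to new ones. After finishing, though, it occurred to me that I should have explored hotword detection, because 5 consecutive clicks are basically a hotword: they have a pattern that I have to look for.

Anyway, here's my code: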

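# audioop is deprecated since Python 3.11 and removed in Python 3.13
import audioop
import logging
import time

import pyaudio

POP_MIN_MS = 50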
POP_MAX_MS = 150

POP_GAP_MIN_MS = 50
POP_GAP_MAX_MS = 200

POP_BORDER_MIN_MS = 500

assert POP_BORDER_MIN_MS > POP_GAP_MAX_MS

POP_RMS_THRESHOLD_MIN = 100

FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100 # Sampling Rate -- frames per second
INPUT_BLOCK_TIME_MS = 50
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME_MS/1000)

# Integer division keeps the block counts as ints on Python 3 as well
POP_MIN_BLOCKS = POP_MIN_MS // INPUT_BLOCK_TIME_MS
POP_MAX_BLOCKS = POP_MAX_MS // INPUT_BLOCK_TIME_MS

POP_GAP_MIN_BLOCKS = POP_GAP_MIN_MS // INPUT_BLOCK_TIME_MS
POP_GAP_MAX_BLOCKS = POP_GAP_MAX_MS // INPUT_BLOCK_TIME_MS

POP_BORDER_MIN_BLOCKS = POP_BORDER_MIN_MS // INPUT_BLOCK_TIME_MS


def listen(self):
    pops = 0
    sequential_loud_blocks = 0
    sequential_notloud_blocks = 0

    stream = self.pa.open(
      format=FORMAT,
      channels=CHANNELS,
      rate=RATE,
      input=True,
      frames_per_buffer=INPUT_FRAMES_PER_BLOCK
    )

    states = {
      'PENDING': 1,
      'POPPING': 2,
      'ENDING': 3,
    }

    state = states['PENDING']

    while True:
      amp = audioop.rms(stream.read(INPUT_FRAMES_PER_BLOCK), 2)

      is_loud = (amp >= POP_RMS_THRESHOLD_MIN)

      if state == states['PENDING']:
        if is_loud:
          # Only switch to POPPING if it's been quiet for at least the border
          #   period. Otherwise stay in PENDING.
          if sequential_notloud_blocks >= POP_BORDER_MIN_BLOCKS:
            state = states['POPPING']
            sequential_loud_blocks = 1

          # If it's now loud then reset the # of notloud blocks
          sequential_notloud_blocks = 0
        else:
          sequential_notloud_blocks += 1

      elif state == states['POPPING']:

        if is_loud:
          sequential_loud_blocks += 1
          # TODO: Is this necessary?
          sequential_notloud_blocks = 0

          if sequential_loud_blocks > POP_MAX_BLOCKS:
            # it's been loud for too long; this isn't a pop
            state = states['PENDING']
            pops = 0
            #print "loud too long"
            # since it has been loud and remains loud then no reason to reset
            #   the notloud_blocks count

        else:
          # not loud
          if sequential_loud_blocks:
            # just transitioned from loud. was that a pop?
            # we know it wasn't too long, or we would have transitioned to
            #   PENDING during the pop
            if sequential_loud_blocks < POP_MIN_BLOCKS:
              # wasn't long enough
              # go to PENDING
              state = states['PENDING']
              pops = 0
              #print "not loud long enough"
            else:
              # just right
              pops += 1
              logging.debug("POP #%s", pops)

            sequential_loud_blocks = 0
            sequential_notloud_blocks += 1

          else:
            # it has been quiet. and it's still quiet
            sequential_notloud_blocks += 1

            if sequential_notloud_blocks > POP_GAP_MAX_BLOCKS:
              # it was quiet for too long
              # we're no longer popping, but we don't know if this is the
              #   border at the end
              state = states['ENDING']

      elif state == states['ENDING']:
        if is_loud:
          # a loud block before the required border gap. reset
          # since there wasn't a gap, this couldn't be a valid pop anyways
          #   so just go back to PENDING and let it monitor for the border
          sequential_loud_blocks = 1
          sequential_notloud_blocks = 0
          pops = 0

          state = states['PENDING']
        else:
          sequential_notloud_blocks += 1

          # Is the border time (500 ms right now) enough of a delay?
          if sequential_notloud_blocks >= POP_BORDER_MIN_BLOCKS:
            # that's a bingo!
            if pops == 5:

              stream.stop_stream()

              # assume that starting now the channel is not silent
              start_time = time.time()


              print(">>>>> 5 POPS")

              elapsed = time.time() - start_time

              # time.time() may return fractions of a second, which is ideal
              stream.start_stream()

              # do whatever we need to do

            state = states['PENDING']
            pops = 0

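It needs some formal testing. I found an issue just last night in which it wasn't resetting itself after a pop followed by too long a quiet period. My plan is to refactor and then feed it a stream of simulated RMS values (e.g., (0, 0, 0, 500, 200, 0, 200, 0, ...)) and ensure that it detects (or doesn't detect) appropriately.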
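One way that refactoring could look is sketched below: the same three states, moved into a class that consumes one RMS value at a time, so it can be driven by a plain list instead of a microphone. The class name `PopSequenceDetector` and its parameters are hypothetical. The defaults are the block counts that follow from the constants above at 50 ms per block, and the sketch has not been checked against real audio.

```python
PENDING, POPPING, ENDING = "PENDING", "POPPING", "ENDING"


class PopSequenceDetector:
    def __init__(self, threshold=100, pop_min=1, pop_max=3,
                 gap_max=4, border_min=10, target=5):
        self.threshold = threshold
        self.pop_min = pop_min        # shortest pop, in blocks
        self.pop_max = pop_max        # longest pop, in blocks
        self.gap_max = gap_max        # longest quiet gap between pops
        self.border_min = border_min  # quiet required before and after
        self.target = target          # number of pops to look for
        self.state = PENDING
        self.pops = 0
        self.loud = 0                 # consecutive loud blocks
        self.quiet = 0                # consecutive quiet blocks

    def feed(self, amp):
        """Consume one RMS value; return True if it completes a sequence."""
        is_loud = amp >= self.threshold
        detected = False
        if self.state == PENDING:
            if is_loud:
                # only start popping after a long enough leading silence
                if self.quiet >= self.border_min:
                    self.state, self.loud = POPPING, 1
                self.quiet = 0
            else:
                self.quiet += 1
        elif self.state == POPPING:
            if is_loud:
                self.loud += 1
                self.quiet = 0
                if self.loud > self.pop_max:      # loud for too long
                    self.state, self.pops = PENDING, 0
            elif self.loud:                       # a loud run just ended
                if self.loud < self.pop_min:      # too short to be a pop
                    self.state, self.pops = PENDING, 0
                else:
                    self.pops += 1
                self.loud = 0
                self.quiet += 1
            else:                                 # still quiet
                self.quiet += 1
                if self.quiet > self.gap_max:
                    self.state = ENDING
        else:  # ENDING: waiting for the trailing border of silence
            if is_loud:
                self.state, self.pops, self.quiet = PENDING, 0, 0
            else:
                self.quiet += 1
                if self.quiet >= self.border_min:
                    detected = self.pops == self.target
                    self.state, self.pops = PENDING, 0
        return detected


# Simulated stream: leading silence, five one-block pops, trailing silence.
stream = [0] * 10 + [500, 0, 0] * 5 + [0] * 10
detector = PopSequenceDetector()
hits = [i for i, amp in enumerate(stream) if detector.feed(amp)]
```

Here `hits` collects the block indices at which a full sequence was confirmed. Streams with four pops, no leading silence, or an overlong pop make useful negative cases.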

Finding a series of patterns in a data stream