Question

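I have a DataFrame with several data channels and one trigger channel. The trigger channel is 0 or 1; it is 1 when an event occurs.

I want to detect each event (trigger == 1) and sample the data channels in the DataFrame so that I get the data for a specified time window after the event trigger.

As a concrete example, consider: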

import numpy as np
import pandas as pd

np.random.seed(0)

# time in seconds
t = np.arange(10)*0.1

# data channels and trigger
d = dict(y=np.random.randn(10),
         z=np.random.randn(10),
         trigger=[0, 1, 0, 0, 0, 1, 0, 0, 0, 0])
df = pd.DataFrame(d, index=t)

So df is:

     trigger         y         z
0.0        0  1.764052  0.144044
0.1        1  0.400157  1.454274
0.2        0  0.978738  0.761038
0.3        0  2.240893  0.121675
0.4        0  1.867558  0.443863
0.5        1 -0.977278  0.333674
0.6        0  0.950088  1.494079
0.7        0 -0.151357 -0.205158
0.8        0 -0.103219  0.313068
0.9        0  0.410599 -0.854096

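Suppose my time window is 0.2 s. Then, since the trigger is 1 at times 0.1 s and 0.5 s, I want to extract the 0.2 s window of y and z following each trigger and put them into a 3D numpy array with dimensions (number of events, samples in the time window, number of channels).

In this example the dimensions would be (2, 3, 2): 2 trigger events, 3 samples in the 0.2 s time window, and 2 channels (y, z).

Is there an efficient way to do this in pandas? The only way I can think of right now is to loop over the trigger == 1 events.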

Answer 1

Extract the row indices where the trigger is set:

idx = np.where(df.trigger)[0]

Expand that 1D array into a 2D array of all the indices to sample (here we use 3 samples per trigger):

samples = np.arange(3) + idx[:,np.newaxis]

This is a 2x3 array:

array([[1, 2, 3],
       [5, 6, 7]])
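
To see why this yields one row per trigger, the broadcasting can be checked in isolation. This is only a small sketch; the name `offsets` is introduced here for illustration and is not part of the answer's code:

```python
import numpy as np

idx = np.array([1, 5])    # trigger row positions from the example
offsets = np.arange(3)    # 0, 1, 2: one offset per sample in the window

# idx[:, np.newaxis] has shape (2, 1); adding an array of shape (3,)
# broadcasts both operands to shape (2, 3)
samples = offsets + idx[:, np.newaxis]
print(samples.shape)
```

Each row of `samples` starts at a trigger position and counts forward by one row per sample.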

We use it to get the final result:

out = df[['y', 'z']].values[samples]

This is a 2x3x2 array: one block per trigger, one row per sample in that trigger's window, and the y and z values in each row.
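
The steps above could be wrapped into one function along the following lines. This is a sketch under some assumptions that go beyond the answer: the data is uniformly sampled with spacing `dt`, the window includes the trigger sample itself (so 0.2 s at 0.1 s spacing gives 3 samples), and triggers too close to the end of the frame are dropped rather than padded. The function name `extract_windows` and its parameters are hypothetical.

```python
import numpy as np
import pandas as pd


def extract_windows(df, channels, window, dt, trigger_col="trigger"):
    """Return an array of shape (n_events, n_samples, n_channels)."""
    # Samples per window, counting the trigger sample itself
    n_samples = int(round(window / dt)) + 1
    # Row positions where the trigger is non-zero
    idx = np.flatnonzero(df[trigger_col].to_numpy())
    # Assumption: drop triggers whose window would run past the last row
    idx = idx[idx + n_samples <= len(df)]
    # Broadcast to an (n_events, n_samples) array of row positions
    samples = idx[:, np.newaxis] + np.arange(n_samples)
    return df[channels].to_numpy()[samples]


# Same example data as in the question
np.random.seed(0)
t = np.arange(10) * 0.1
df = pd.DataFrame(dict(y=np.random.randn(10),
                       z=np.random.randn(10),
                       trigger=[0, 1, 0, 0, 0, 1, 0, 0, 0, 0]), index=t)

out = extract_windows(df, ["y", "z"], window=0.2, dt=0.1)
print(out.shape)  # (2, 3, 2)
```

Without the bounds check, a trigger in one of the last rows would produce row positions past the end of the array and the final indexing step would raise an IndexError.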
