Question

I have a list of lists (2000x1000), but take this one as an example (10x3):

l = [[8, 7, 6], [5, 3, 1], [4, 5, 9], [1, 5, 1], [3, 5, 7], [8, 2, 5], [1, 9, 2], [8, 7, 6], [9, 9, 9], [4, 5, 9]]

In this example, each inner list holds the 3 measurements taken at one instant:

t0 -> [8, 7, 6]
t1 -> [5, 3, 1]

and so on.

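I want to compare the measurements position by position over a window of 4 instants, and take the maximum value that lies within the 99% percentile of the peak-to-peak values.

Example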

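Let's consider the first window:

[8, 7, 6], [5, 3, 1], [4, 5, 9], [1, 5, 1]:
[8, 5, 4, 1] -> peak to peak: 8 - 1 = 7
[7, 3, 5, 5] -> ptp = 4
[6, 1, 9, 1] -> ptp = 8

With these 3 values, [7, 4, 8], I want to take the max of the 99% percentile, which in this case is 7.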

Second window:

[5, 3, 1], [4, 5, 9], [1, 5, 1], [3, 5, 7]:
[5, 4, 1, 3] -> ptp = 4
[3, 5, 5, 5] -> ptp = 2
[1, 9, 1, 7] -> ptp = 8
max of the 99% percentile -> 4

After doing this for every window of size 4, I want to build a list from these values.
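The per-window peak-to-peak step described above can be sketched in plain Python. This is only an illustration of the arithmetic, not a fast implementation, and the helper name `ptp_per_position` is made up for this example.

```python
l = [[8, 7, 6], [5, 3, 1], [4, 5, 9], [1, 5, 1], [3, 5, 7], [8, 2, 5],
     [1, 9, 2], [8, 7, 6], [9, 9, 9], [4, 5, 9]]


def ptp_per_position(rows, start, size):
    """Peak-to-peak of each measurement position over rows[start:start + size]."""
    window = rows[start:start + size]
    # zip(*window) regroups the window by position: (8, 5, 4, 1), (7, 3, 5, 5), ...
    return [max(col) - min(col) for col in zip(*window)]


first = ptp_per_position(l, 0, 4)   # window t0..t3
second = ptp_per_position(l, 1, 4)  # window t1..t4
print(first)
print(second)
```

The two calls reproduce the peak-to-peak values worked out by hand for the first and second windows.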

My code is below, but it is slow. Is there a faster way to implement this?

Note: I cannot use pandas, and the NumPy version must be <= 1.6.

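import numpy as np

num_meas = 4  # window size (number of instants)
m = []
for index, i in enumerate(l):
    if index < len(l) - num_meas + 1:
        p = []
        for j in range(len(i)):
            # collect measurement j over the current window
            t = []
            for k in range(num_meas):
                t.append(l[index + k][j])
            t = [x for x in t if ~np.isnan(x)]
            try:
                a = np.ptp(t)
            except ValueError:  # nothing left after dropping NaNs
                a = 0
            p.append(a)
        perce = np.percentile(p, 99)
        # largest peak-to-peak value strictly below the 99th percentile
        p = max([el for el in p if el < perce])
        m.append(p)
print(m)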

Answer 1

Please check whether the following code works with NumPy 1.6:

import numpy as np

l = [[8, 7, 6], [5, 3, 1], [4, 5, 9], [1, 5, 1], [3, 5, 7], [8, 2, 5],
     [1, 9, 2], [8, 7, 6], [9, 9, 9], [4, 5, 9]]

l = np.array(l)

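# peak-to-peak (range) matrix: one row per window of 4 instants
mat_ptp = np.zeros((l.shape[0]-3, l.shape[1]))

for i in range(l.shape[0]-3):
    # ptp of each measurement position over rows i..i+3
    l[i:i+4].ptp(axis=0, out=mat_ptp[i])

# 99th percentile of each window's ptp values
percentiles = np.percentile(mat_ptp, 99, axis=1)
# mask out values at or above the percentile so only strictly smaller ones remain
greater_pos = np.greater_equal(mat_ptp, percentiles.reshape(-1, 1))
mat_ptp[greater_pos] = -np.inf

result = np.max(mat_ptp, axis=1)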

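To improve performance, try to vectorize your operations with NumPy as much as possible. That is usually much faster than using for loops and append calls.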
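As a further illustration of the vectorization advice, the loop over windows can itself be avoided by folding shifted slices together, so that Python only iterates over the window size (4) rather than over every window. This is a sketch under assumptions: the helper names `window_ptp` and `max_below_percentile` are made up here, and it has not been tested against NumPy 1.6, although it only relies on basic calls (`np.maximum`, `np.minimum`, `np.where`, `np.percentile`).

```python
import numpy as np


def window_ptp(data, window):
    """Peak-to-peak per column for every sliding window of `window` rows."""
    data = np.asarray(data, dtype=float)
    n_windows = data.shape[0] - window + 1
    # Start from the first row of each window, then fold in the shifted slices.
    hi = data[:n_windows].copy()
    lo = data[:n_windows].copy()
    for k in range(1, window):
        shifted = data[k:k + n_windows]
        hi = np.maximum(hi, shifted)
        lo = np.minimum(lo, shifted)
    return hi - lo


def max_below_percentile(ptp, q=99):
    """Per row, the largest value strictly below that row's q-th percentile."""
    perc = np.percentile(ptp, q, axis=1)
    # Values at or above the percentile are replaced by -inf before taking the max.
    masked = np.where(ptp < perc.reshape(-1, 1), ptp, -np.inf)
    return masked.max(axis=1)


l = [[8, 7, 6], [5, 3, 1], [4, 5, 9], [1, 5, 1], [3, 5, 7], [8, 2, 5],
     [1, 9, 2], [8, 7, 6], [9, 9, 9], [4, 5, 9]]

result = max_below_percentile(window_ptp(l, 4))
print(result)
```

For the 10x3 example this gives 7, 4, 7, 6, 5, 7, 7 for the seven windows. A window whose peak-to-peak values are all equal has nothing strictly below its percentile, so its entry comes out as `-inf`.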

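EDIT

Sorry, I had not noticed that you want the selected element to be strictly smaller than the percentile. The code above is the corrected version.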
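A small case where the strict comparison matters, as a sketch: when the largest peak-to-peak value is repeated, the 99th percentile equals that value, so `<` and `<=` select different maxima. The row used here is the peak-to-peak result of the window starting at the fourth instant in the 10x3 example.

```python
import numpy as np

row = np.array([7.0, 7.0, 6.0])  # ptp values of one window
perc = np.percentile(row, 99)    # interpolates between 7 and 7

strict = row[row < perc].max()       # only 6.0 is strictly below the percentile
non_strict = row[row <= perc].max()  # 7.0 itself is still allowed

print(perc, strict, non_strict)
```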

BENCHMARK

Just to verify the point about performance, here are the results for:

l = np.random.randint(0, 100, size=(200, 100))

Running 100 times with timeit:

OP code: 0.5197743272900698 ms on average
Code above: 0.0021439407201251015 on average
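A measurement of this kind could be set up roughly as follows. This is a sketch, not the script that produced the figures above: the function `loop_ptp` is a stand-in written for this example that covers only the peak-to-peak step, and the timings will differ by machine and NumPy version.

```python
import timeit

import numpy as np


def loop_ptp(data, window=4):
    """One peak-to-peak computation per window, written with max - min."""
    n_windows = data.shape[0] - window + 1
    out = np.zeros((n_windows, data.shape[1]))
    for i in range(n_windows):
        chunk = data[i:i + window]
        out[i] = chunk.max(axis=0) - chunk.min(axis=0)
    return out


data = np.random.randint(0, 100, size=(200, 100))

runs = 100
total = timeit.timeit(lambda: loop_ptp(data), number=runs)
print("average per run: %.6f s" % (total / runs))
```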
