Question

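I have a very long array, and I am trying to do the following as efficiently as possible:

For each consecutively increasing chunk in the list, I have to reverse its order.

So, for the following array: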

a = np.array([1,5,7,3,2,5,4,45,1,5,10,12])

I would like to obtain:

array([7,5,1,3,5,2,45,4,12,10,5,1])

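I was wondering whether this can be vectorized, perhaps using numpy?

I already got some answers in a previous question (this), but although the results are a big improvement, it is still a bit slow.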

Answer 1

Can you use pandas?

import pandas as pd
a = [1,5,7,3,2,5,4,45,1,5,10,12]
aa = pd.Series(a)
aa.groupby(aa.diff().bfill().lt(0).cumsum()).apply(lambda x: x[::-1]).tolist()

Output:

[7, 5, 1, 3, 5, 2, 45, 4, 12, 10, 5, 1]

Answer 2

Another option, with no dependencies:

array = [1,5,7,3,2,5,4,45,1,5,10,12]

res, memo = [], []
for e in array:
  if len(memo) == 0 or e > memo[-1]: memo.append(e)
  else:
    res.extend(reversed(memo))
    memo = [e]
res.extend(reversed(memo))

res # => [7, 5, 1, 3, 5, 2, 45, 4, 12, 10, 5, 1]

A modified version that is a bit faster:

def reverse_if_chunck_increases(array):
  res, memo, last_memo = [], [], None
  for e in array:
    if last_memo is None or e > last_memo:  # "is None" so that a value of 0 is not treated as "unset"
      last_memo = e
      memo.append(e)
    else:
      res.extend(memo[::-1])
      last_memo, memo = e, [e]
  res.extend(memo[::-1])
  return res

print(reverse_if_chunck_increases(array) == [7, 5, 1, 3, 5, 2, 45, 4, 12, 10, 5, 1])
#=> True

Edit after the answer was accepted (maybe useful).

I was able to get the result this easily, and apparently with less coding effort, in Ruby:

array.chunk_while { |x, y| x < y }.flat_map{ |chunk| chunk.reverse }

So I wondered why there isn't something like chunk_while in itertools. Then I tried to write a similar one using yield:

def reverse_if_chunk_increases(array):
  i, x, size = 0, 0, len(array)
  while i < size-1:
    if array[i] >= array[i+1]:  # split unless strictly increasing, matching x < y in the Ruby version
      yield array[x:i+1][::-1]
      x = i +1
    i += 1
  yield array[x:size][::-1]

It executes very fast, but it returns a generator to iterate over instead of a list:

chunks = reverse_if_chunk_increases(array)
for chunk in chunks:
  print(chunk)
# [7, 5, 1]
# [3]
# [5, 2]
# [45, 4]
# [12, 10, 5, 1]

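It can be converted to a list, which is the slowest step. Note that a generator can only be consumed once. Removing [::-1] gives a result similar to Ruby's chunk_while enumerator/generator.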
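To make the comparison with Ruby concrete, here is a minimal sketch of a generic `chunk_while` in plain Python. The helper name `chunk_while` and its `(iterable, predicate)` signature are assumptions made for this illustration; nothing with that name exists in itertools. It accepts any iterable, not only sequences, and the chunks are flattened with `itertools.chain`:

```python
from itertools import chain

def chunk_while(iterable, predicate):
    """Yield lists of consecutive items; a chunk continues while
    predicate(previous_item, current_item) is true.
    (Hypothetical helper, modeled on Ruby's Enumerable#chunk_while.)"""
    it = iter(iterable)
    try:
        chunk = [next(it)]
    except StopIteration:
        return  # empty input: yield nothing
    for item in it:
        if predicate(chunk[-1], item):
            chunk.append(item)
        else:
            yield chunk
            chunk = [item]
    yield chunk

array = [1, 5, 7, 3, 2, 5, 4, 45, 1, 5, 10, 12]
chunks = chunk_while(array, lambda x, y: x < y)
# Reverse each chunk and flatten, like flat_map { |chunk| chunk.reverse }
result = list(chain.from_iterable(reversed(c) for c in chunks))
print(result)  # [7, 5, 1, 3, 5, 2, 45, 4, 12, 10, 5, 1]
```

As with the generator above, the chunks can only be iterated once; the flattening step is where the list is actually built.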

Answer 3

How about this? It seems to be faster afaik, but without knowing how fast is fast enough it's hard to be sure.

all_l = []
sub_l = []
for i in a:
    if sub_l:
        if sub_l[0] > i:
            all_l.extend(sub_l)
            sub_l = [i]
        else:
            sub_l.insert(0, i)
    else:
        sub_l = [i]
all_l.extend(sub_l)
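One detail of the loop above: `list.insert(0, i)` has to shift every element already in `sub_l`, so its cost grows with the length of the current run. A possible variant, sketched here with `collections.deque` and its O(1) `appendleft`, keeps the same logic. The function name `reverse_increasing_runs_deque` is made up for this illustration, and the sketch has not been timed, so whether it helps will depend on how long the increasing runs are:

```python
from collections import deque

def reverse_increasing_runs_deque(values):
    """Same logic as the loop above, with a deque as the per-run buffer.
    (Hypothetical name; a sketch, not a benchmarked implementation.)"""
    out = []
    run = deque()  # current run, stored already reversed
    for v in values:
        if run and run[0] > v:
            # v is smaller than the last value seen: the run ends here
            out.extend(run)
            run = deque([v])
        else:
            run.appendleft(v)  # O(1), unlike list.insert(0, v)
    out.extend(run)
    return out

a = [1, 5, 7, 3, 2, 5, 4, 45, 1, 5, 10, 12]
print(reverse_increasing_runs_deque(a))
# [7, 5, 1, 3, 5, 2, 45, 4, 12, 10, 5, 1]
```

Like the original loop, this treats equal neighbours as part of the same run, because only a strictly smaller value ends a run.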

Answer 4

I don't think you're going to get much faster than a pure Python loop.

For example, here is a numpy + itertools solution:

import numpy as np
from itertools import chain, groupby
from operator import itemgetter

def reverse_increasing_sequences_numpy(a):
    idx = (np.diff(np.concatenate([[a[0]], a]))<0).cumsum()
    return list(
        chain.from_iterable(
            (reversed([x[0] for x in g]) for v, g in groupby(zip(a, idx), itemgetter(1)))
        )
    )

print(reverse_increasing_sequences_numpy(a))
#[7, 5, 1, 3, 5, 2, 45, 4, 12, 10, 5, 1]

But look at the speed test results:

b = np.random.choice(10, 100000)

%%timeit
reverse_increasing_sequences_numpy(b)
#47.1 ms ± 778 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
reverse_increasing_sequences_iGian(b)
#40.3 ms ± 1.31 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
reverse_increasing_sequences_hchw(b)
#26.1 ms ± 1.35 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

@hchw's solution runs almost 2x faster than my numpy version.
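For completeness, the reversal can also be written without any Python-level loop by building an index array that mirrors each position inside its run. This is only a sketch under stated assumptions: the function name `reverse_increasing_runs_vectorized` is invented here, a run is taken to be strictly increasing (an equal neighbour starts a new run, unlike the `< 0` test above), and it was not part of the timings above, so no speed claim is made:

```python
import numpy as np

def reverse_increasing_runs_vectorized(a):
    """Reverse every strictly increasing run of a 1-D array using
    index arithmetic only. (Hypothetical name; untimed sketch.)"""
    a = np.asarray(a)
    n = a.size
    if n == 0:
        return a.copy()
    # True where a new run starts: the first element, and every
    # element that is not greater than its predecessor.
    new_run = np.empty(n, dtype=bool)
    new_run[0] = True
    new_run[1:] = a[1:] <= a[:-1]
    starts = np.flatnonzero(new_run)   # first index of each run
    ends = np.append(starts[1:], n)    # one past the last index of each run
    lengths = ends - starts
    # Broadcast each run's start/end to every position inside that run.
    run_start = np.repeat(starts, lengths)
    run_end = np.repeat(ends, lengths)
    # Position p in run [s, e) is mirrored to s + (e - 1) - p.
    idx = run_start + run_end - 1 - np.arange(n)
    return a[idx]

a = np.array([1, 5, 7, 3, 2, 5, 4, 45, 1, 5, 10, 12])
print(reverse_increasing_runs_vectorized(a))
# [ 7  5  1  3  5  2 45  4 12 10  5  1]
```

It allocates a few temporary arrays of the same length as the input, so it is worth timing against the loops above on realistic data before choosing it.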
