Question

We want to select some elements of a one-dimensional array v_data. The processing we need to do requires looping over subvectors of v_data.

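Right now I use bit logic such as Gosper's hack to create an integer n_mask whose binary representation corresponds to the indices of v_data that I want. n_mask can be converted to a binary vector as follows: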

def num2bv(num, n_len):
    """Convert a number to a binary vector of some length"""
    return [bool((2**ii & num)//(2**ii)) for ii in reversed(range(0, n_len))]

After setting bv_mask = num2bv(n_mask, len(v_data)), the subvector can be recovered by running v_data[bv_mask].
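
As a rough illustration of the workflow described above, the sketch below steps through every mask with k set bits using Gosper's hack and indexes v_data with each one. The names gosper_next and subvectors are hypothetical helpers introduced here, not part of the question, and the code assumes that v_data is a NumPy array and that k >= 1.

```python
import numpy as np

def num2bv(num, n_len):
    """Convert a number to a binary vector of some length"""
    return [bool((2**ii & num)//(2**ii)) for ii in reversed(range(0, n_len))]

def gosper_next(x):
    """Next larger integer with the same number of set bits (Gosper's hack)."""
    c = x & -x               # lowest set bit of x
    r = x + c                # ripple the lowest run of ones upward
    return (((r ^ x) >> 2) // c) | r

def subvectors(v_data, k):
    """Yield every subvector of v_data that has exactly k elements (k >= 1)."""
    n = len(v_data)
    n_mask = (1 << k) - 1    # smallest mask with k set bits
    while n_mask < (1 << n):
        bv_mask = np.array(num2bv(n_mask, n), dtype=bool)
        yield v_data[bv_mask]
        n_mask = gosper_next(n_mask)

for sub in subvectors(np.arange(4), 2):
    print(sub)
```

Because num2bv puts the most significant bit first, the lowest bit of n_mask selects the last element of v_data.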

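Is this a bad approach? In particular, I am worried that:

indexing with a binary vector will be slow in practice
num2bv will be slow in practice
being able to use this technique for vectors of any length depends on Python's arbitrary-precision integers, which might be slow or not portable

Are these valid concerns?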

Answer 1

Depending on the length of the vector, using itertools.combinations may be much faster:

In [2]: v = np.array(range(15))

In [3]: %time x = [v[num2bv(i,15)] for i in range(2**15)]
CPU times: user 498 ms, sys: 5.83 ms, total: 504 ms
Wall time: 506 ms

In [4]: %time x = [c for i in range(15) for c in combinations(v,i)]
CPU times: user 5.67 ms, sys: 1.53 ms, total: 7.2 ms
Wall time: 6.91 ms
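
For reference, a small self-contained sketch of the combinations approach follows. Note that range(15) in the session above stops at subsets of size 14, so the full 15-element vector is not produced; range(len(v) + 1) includes it. The variable names subsets and subarrays are illustrative only.

```python
from itertools import combinations
import numpy as np

v = np.arange(4)

# All subsets as tuples of elements; range(len(v) + 1) also includes
# the full-length subset.
subsets = [c for k in range(len(v) + 1) for c in combinations(v, k)]

# The same subsets as NumPy arrays, built from combinations of indices.
subarrays = [v[list(idx)]
             for k in range(len(v) + 1)
             for idx in combinations(range(len(v)), k)]

print(len(subsets), len(subarrays))
```

combinations yields tuples rather than arrays, so the second form is the one to use if the later processing expects NumPy arrays.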

Answer 2

An array version of num2bv is:

def foo(num, n_len):
    c = np.int64(2)**np.arange(n_len)[::-1]
    return (np.bitwise_and(c,num)//c).astype(bool)

In [713]: N=16
In [714]: foo(2**(N//2)-1,N)
Out[714]: 
array([False, False, False, False, False, False, False, False,  True,
        True,  True,  True,  True,  True,  True,  True], dtype=bool)
In [715]: np.array(num2bv(2**(N//2)-1,N))
Out[715]: 
array([False, False, False, False, False, False, False, False,  True,
        True,  True,  True,  True,  True,  True,  True], dtype=bool)

In [716]: N=32
In [717]: timeit np.array(num2bv(2**(N//2)-1,N))
10000 loops, best of 3: 39.7 µs per loop
In [718]: timeit foo(2**(N//2)-1,N)
The slowest run took 4.72 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 15.7 µs per loop

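But for large N the array version starts to run into problems with the integer representation of 2**i: the powers are held as np.int64, which cannot represent 2**63 or anything above it.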
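
One way around that limit, sketched here under the assumption that num is a plain Python int, is to do the bit tests with Python's arbitrary-precision integers and only build the array at the end. foo_shift is a hypothetical name, and the loop is not vectorized, so it trades speed for range.

```python
import numpy as np

def foo_shift(num, n_len):
    """Boolean mask of num, most significant bit first, using Python ints."""
    # (num >> ii) & 1 is evaluated with Python integers, so n_len may exceed 63.
    return np.array([(num >> ii) & 1 for ii in reversed(range(n_len))],
                    dtype=bool)

N = 100
mask = foo_shift(1 << (N - 1), N)   # only the most significant bit is set
print(mask[0], mask.sum())
```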

For larger N, the num2bv time dominates the indexing, e.g. for N=1024:

In [734]: timeit num2bv(2**(N//2)-1,N)
100 loops, best of 3: 4.34 ms per loop
In [735]: timeit np.nonzero(num2bv(2**(N//2)-1,N))
100 loops, best of 3: 4.4 ms per loop
In [736]: A=np.ones(N)
In [738]: timeit A[num2bv(2**(N//2)-1,N)]
100 loops, best of 3: 4.39 ms per loop

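np.binary_repr returns a string representation. It uses Python's bin. One way to create the mask is to split the string with list and then let np.array convert it to a boolean array: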

In [846]: np.binary_repr(100,16)
Out[846]: '0000000001100100'
In [848]: np.array(list(np.binary_repr(100,16)),bool)
Out[848]: 
array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True], dtype=bool)
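
Note that Out[848] above is all True, which does not match the bit pattern '0000000001100100'; it looks as though each one-character string is being treated as truthy. A minimal sketch that compares the characters against '1' instead is given below. binary_repr_mask is a hypothetical name, and this variant has not been timed against the figures in this answer.

```python
import numpy as np

def binary_repr_mask(num, n_len):
    """Boolean mask from np.binary_repr, True where the character is '1'."""
    chars = np.array(list(np.binary_repr(num, n_len)))
    return chars == '1'

mask = binary_repr_mask(100, 16)
print(mask)
```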

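It isn't as fast as to_numpy_mask_2 (from the other answer), but it is still a big improvement over num2bv, and it isn't subject to the size limit of my earlier foo.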

In [842]: timeit np.array(num2bv(100,120))
10000 loops, best of 3: 166 µs per loop
In [843]: timeit to_numpy_mask_2(100,120)
The slowest run took 5.07 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.6 µs per loop
In [850]: timeit np.array(list(np.binary_repr(100,120)),bool)
100000 loops, best of 3: 18 µs per loop

Answer 3

Combining the integer method .to_bytes with np.unpackbits is quite fast. It converts a 1,000,000-bit word in under a millisecond:

import numpy as np
import random
from timeit import timeit

# a slower approach using builtin 'bin' function
def to_numpy_mask_1(n, bits = 120):
    return (np.frombuffer(bin(n + 2**bits)[-bits:].encode('utf8'),
                          dtype=np.uint8)-48).view(bool)

# the real thing based on '.to_bytes'
def to_numpy_mask_2(n, bits = 120):
    return np.unpackbits(np.frombuffer(n.to_bytes((bits-1)//8 + 1, 'big'),
                                       dtype=np.uint8)).view(bool)[-bits:]

# check

base = 2**np.arange(120)[::-1].astype(object)
n = random.randrange(2**120)  # randint(0, 2**120) could return 2**120, which needs 121 bits
print(n, base[to_numpy_mask_1(n)].sum(), base[to_numpy_mask_2(n)].sum())

# benchmark

# translation only, no indexing
N = 10**6
n = random.randrange(2**N)  # stay within N bits
print('{:8.6g} secs'.format(timeit(lambda: to_numpy_mask_1(n, bits = N),
                                   number=10)/10))
print('{:8.6g} secs'.format(timeit(lambda: to_numpy_mask_2(n, bits = N),
                                   number=10)/10))

# including indexing
data = np.random.randn(N)
print('{:8.6g} secs'.format(timeit(lambda: data[to_numpy_mask_1(n, bits = N)],
                                   number=10)/10))
print('{:8.6g} secs'.format(timeit(lambda: data[to_numpy_mask_2(n, bits = N)],
                                   number=10)/10))

Sample output:

# 303734588154968662776606530859339928 303734588154968662776606530859339928 303734588154968662776606530859339928
# 0.00622677 secs
# 0.00051558 secs
# 0.0121338 secs
# 0.00697929 secs
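
To make the bit layout concrete, here is a small worked example that reuses the to_numpy_mask_2 definition from above on a 16-bit word. The data and selected names are illustrative, and the example assumes n is a non-negative Python int that fits in the requested number of bits.

```python
import numpy as np

def to_numpy_mask_2(n, bits=120):
    # Same definition as in the answer above: big-endian bytes -> bits -> bool.
    return np.unpackbits(np.frombuffer(n.to_bytes((bits - 1)//8 + 1, 'big'),
                                       dtype=np.uint8)).view(bool)[-bits:]

data = np.arange(16)
mask = to_numpy_mask_2(100, bits=16)   # 100 == 0b0000000001100100
selected = data[mask]                  # elements at the positions of the set bits
print(selected)
```

When bits is not a multiple of 8, the [-bits:] slice drops the padding bits at the front of the first byte.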

Best way to select subvectors in Python using a mask
