NumPy Fancy Indexing - Crop different ROIs from different channels

Time: 2018-01-10 15:41:00

Tags: python arrays numpy indexing

Suppose we have the following 4D NumPy array (3x1x4x4):

import numpy as np
n, c, h, w = 3, 1, 4, 4

data = np.arange(n * c * h * w).reshape(n, c, h, w)

>>> array([[[[ 0,  1,  2,  3],
             [ 4,  5,  6,  7],
             [ 8,  9, 10, 11],
             [12, 13, 14, 15]]],

           [[[16, 17, 18, 19],
             [20, 21, 22, 23],
             [24, 25, 26, 27],
             [28, 29, 30, 31]]],

           [[[32, 33, 34, 35],
             [36, 37, 38, 39],
             [40, 41, 42, 43],
             [44, 45, 46, 47]]]])

Now I want to crop each of the n sub-arrays at a different location, using the same crop size for all of them:

size = 2
locations = np.array([
    [0, 1],
    [1, 1],
    [0, 2]
])

A slow way to do this is the following:

crops = np.stack([d[:, y:y+size, x:x+size] 
     for d, (y,x) in zip(data, locations)])

>>> array([[[[ 1,  2],
             [ 5,  6]]],

           [[[21, 22],
             [25, 26]]],

           [[[34, 35],
             [38, 39]]]])

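Now I'm looking for a way to achieve this with NumPy's fancy indexing. I have already spent several hours trying to figure out how to solve this. Am I overlooking a simple way to do it? Are there any NumPy indexing experts out there who can help me?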

2 answers:

Answer 0: (score: 1)

We can extend this solution to your 3D case by leveraging np.lib.stride_tricks.as_strided based sliding-windowed views for efficient patch extraction, like so:

from skimage.util.shape import view_as_windows

def get_patches(data, locations, size):
    # Get 2D sliding windows for each element of data
    w = view_as_windows(data, (1,1,size,size))

    # Use fancy/advanced indexing to select the required ones
    return w[np.arange(len(locations)), :, locations[:,0], locations[:,1]][:,:,0,0]

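We need those 1,1 in the window argument to view_as_windows because it expects the window to have as many elements as the input data has dimensions. We slide along the last two axes of data, so the first two entries are kept as 1s, which basically means there is no sliding along the first two axes of data.

Sample runs on single-channel and multi-channel data: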
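
To make the window shapes concrete, here is a minimal sketch of the same idea using only NumPy. It swaps view_as_windows for numpy.lib.stride_tricks.sliding_window_view (available from NumPy 1.20), which is an assumption on my part and not what the answer itself uses. The function name get_patches_np is hypothetical. Because the windows are taken over the last two axes only, no padding 1s are needed in the window shape.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

def get_patches_np(data, locations, size):
    # Slide only along the last two axes (h, w).
    # Result shape: (n, c, h - size + 1, w - size + 1, size, size)
    windows = sliding_window_view(data, (size, size), axis=(2, 3))
    # Pick one window per sample while keeping every channel.
    # Result shape: (n, c, size, size)
    return windows[np.arange(len(locations)), :, locations[:, 0], locations[:, 1]]

n, c, h, w = 3, 1, 4, 4
data = np.arange(n * c * h * w).reshape(n, c, h, w)
locations = np.array([[0, 1], [1, 1], [0, 2]])

windows = sliding_window_view(data, (2, 2), axis=(2, 3))
print(windows.shape)   # (3, 1, 3, 3, 2, 2)

patches = get_patches_np(data, locations, 2)
print(patches.shape)   # (3, 1, 2, 2)
```

The windowed array is a view on data, so nothing is copied until the advanced indexing step selects the patches.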

In [78]: n, c, h, w = 3, 1, 4, 4 # number of channels = 1
    ...: data = np.arange(n * c * h * w).reshape(n, c, h, w)
    ...: 
    ...: size = 2
    ...: locations = np.array([
    ...:     [0, 1],
    ...:     [1, 1],
    ...:     [0, 2]
    ...: ])
    ...: 
    ...: crops = np.stack([d[:, y:y+size, x:x+size] 
    ...:      for d, (y,x) in zip(data, locations)])

In [79]: print(np.allclose(get_patches(data, locations, size), crops))
True

In [80]: n, c, h, w = 3, 5, 4, 4 # number of channels = 5
    ...: data = np.arange(n * c * h * w).reshape(n, c, h, w)
    ...: 
    ...: size = 2
    ...: locations = np.array([
    ...:     [0, 1],
    ...:     [1, 1],
    ...:     [0, 2]
    ...: ])
    ...: 
    ...: crops = np.stack([d[:, y:y+size, x:x+size] 
    ...:      for d, (y,x) in zip(data, locations)])

In [81]: print(np.allclose(get_patches(data, locations, size), crops))
True
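
As a cross-check on the runs above, the same crops can probably also be built from broadcast index arrays alone, without a windowed view. This is only a sketch, not part of the answer: the function name crop_with_index_arrays is hypothetical, and it assumes every crop lies fully inside the image.

```python
import numpy as np

def crop_with_index_arrays(data, locations, size):
    n, c = data.shape[:2]
    offsets = np.arange(size)
    ys = locations[:, 0, None] + offsets   # (n, size) row indices per sample
    xs = locations[:, 1, None] + offsets   # (n, size) column indices per sample
    # The four index arrays broadcast to (n, c, size, size).
    return data[
        np.arange(n)[:, None, None, None],
        np.arange(c)[None, :, None, None],
        ys[:, None, :, None],
        xs[:, None, None, :],
    ]

n, c, h, w = 3, 5, 4, 4
data = np.arange(n * c * h * w).reshape(n, c, h, w)
locations = np.array([[0, 1], [1, 1], [0, 2]])
size = 2

expected = np.stack([d[:, y:y + size, x:x + size]
                     for d, (y, x) in zip(data, locations)])
result = crop_with_index_arrays(data, locations, size)
print(np.array_equal(result, expected))
```

Unlike the windowed view, this builds explicit index arrays of the full output size, so it trades some extra memory for having no dependency beyond NumPy.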

Benchmarking

Other approaches:

# Original soln
def stack(data, locations, size):
    crops = np.stack([d[:, y:y+size, x:x+size] 
         for d, (y,x) in zip(data, locations)])    
    return crops

# scholi's soln (fills channel 0 only, which matches the single-channel benchmark data)
def allocate_assign(data, locations, size):
    n, c, h, w = data.shape
    crops = np.zeros((n,c,size,size))
    for i, (y,x) in enumerate(locations):
        crops[i,0,:,:] = data[i,0,y:y+size,x:x+size]
    return crops

From the comments, it seems the OP is interested in data of shape (512,1,60,60) with size values of 12, 24 and 48. So, let's set up that same data with a function:

# Setup data
def create_inputs(size):
    np.random.seed(0)
    n, c, h, w = 512, 1, 60, 60
    data = np.arange(n * c * h * w).reshape(n, c, h, w)
    locations = np.random.randint(0,3,(n,2))
    return data, locations, size

Timings:

In [186]: data, locations, size = create_inputs(size=12)

In [187]: %timeit stack(data, locations, size)
     ...: %timeit allocate_assign(data, locations, size)
     ...: %timeit get_patches(data, locations, size)
1000 loops, best of 3: 1.26 ms per loop
1000 loops, best of 3: 1.06 ms per loop
10000 loops, best of 3: 124 µs per loop

In [188]: data, locations, size = create_inputs(size=24)

In [189]: %timeit stack(data, locations, size)
     ...: %timeit allocate_assign(data, locations, size)
     ...: %timeit get_patches(data, locations, size)
1000 loops, best of 3: 1.66 ms per loop
1000 loops, best of 3: 1.55 ms per loop
1000 loops, best of 3: 470 µs per loop

In [190]: data, locations, size = create_inputs(size=48)

In [191]: %timeit stack(data, locations, size)
     ...: %timeit allocate_assign(data, locations, size)
     ...: %timeit get_patches(data, locations, size)
100 loops, best of 3: 2.8 ms per loop
100 loops, best of 3: 3.33 ms per loop
1000 loops, best of 3: 1.45 ms per loop
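
The timings above rely on IPython's %timeit. Outside IPython, a similar measurement could be sketched with the standard timeit module. The helper name bench and its number/repeat defaults are assumptions chosen for illustration, and only the stack approach is timed here to keep the sketch short; absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

def stack(data, locations, size):
    return np.stack([d[:, y:y + size, x:x + size]
                     for d, (y, x) in zip(data, locations)])

def bench(func, *args, number=20, repeat=3):
    # Best average time per call, in seconds, over `repeat` rounds.
    times = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    return min(times) / number

rng = np.random.RandomState(0)
n, c, h, w = 512, 1, 60, 60
data = np.arange(n * c * h * w).reshape(n, c, h, w)
locations = rng.randint(0, 3, (n, 2))

for size in (12, 24, 48):
    seconds = bench(stack, data, locations, size)
    print("size=%d: %.0f µs per call" % (size, seconds * 1e6))
```

Any of the other functions can be passed to bench in the same way to compare them on identical inputs.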

Answer 1: (score: 0)

Stacking is slow. Since the size is known, it is better to allocate the cropped array first.
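
A minimal sketch of that preallocation idea, under the assumption that all channels should be cropped and the input dtype should be preserved. The function name allocate_then_fill is hypothetical.

```python
import numpy as np

def allocate_then_fill(data, locations, size):
    n, c = data.shape[:2]
    # Allocate the output once, with the same dtype as the input.
    crops = np.empty((n, c, size, size), dtype=data.dtype)
    for i, (y, x) in enumerate(locations):
        # Copy one crop (all channels) into its preallocated slot.
        crops[i] = data[i, :, y:y + size, x:x + size]
    return crops

n, c, h, w = 3, 1, 4, 4
data = np.arange(n * c * h * w).reshape(n, c, h, w)
locations = np.array([[0, 1], [1, 1], [0, 2]])

crops = allocate_then_fill(data, locations, 2)
print(crops[:, 0])
```

The loop still runs once per sample, but each iteration writes into existing memory instead of building a list of arrays for np.stack to concatenate.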

On this small example, Divakar's solution is the slowest: with %%timeit I get 92.3 µs for it, 35.4 µs for your stack solution, and 29.3 µs for the example I provided.