Find connected elements (runs) in a 3D array along a given direction

Time: 2017-07-20 09:10:22

Tags: python numpy

I have a 3D array, for example:

array = np.array([[[ 1.,  1.,  0.],
                   [ 0.,  0.,  0.],
                   [ 0.,  0.,  0.]],
                  [[1.,  0.,  0.],
                   [ 0.,  0.,  0.],
                   [ 0.,  0.,  0.]],
                  [[ 1.,  0.,  0.],
                   [ 0.,  0.,  0.],
                   [ 0.,  0.,  0.]]])

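I need to find the connected runs of every possible size along different directions. For example, in direction (1,0,0), the horizontal direction, the output should be: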

> Runs of size 1: 2
> Runs of size 2: 1
> Runs of size 3: 0

In direction (0,1,0), the vertical axis:

> Runs of size 1: 4
> Runs of size 2: 0
> Runs of size 3: 0

In direction (0,0,1), across the z axis:

> Runs of size 1: 1
> Runs of size 2: 0
> Runs of size 3: 1

Is there an efficient way to do this?

Edit

I am attaching the solution I am working on:

dire = np.array(([1, 0, 0], [0, 1, 0], [0, 0, 1])) # Possible directions
results = np.zeros((array.shape[0], len(dire)))  # Maximum runs will be 3

# Find all indices
indices = np.argwhere(array == 1) 

# Loop over each direction                 
for idire, dire in enumerate(dire):
    results[0, idire] = np.count_nonzero(array) # Count all ones (no connection required)
    indices_to_compare = indices    

# Now we compare the list of indices with a displaced list of indices
    for irun in range(1, array.shape[0]):   
        indices_displaced = indices + dire*irun             
        aset = set([tuple(x) for x in indices_displaced])
        bset = set([tuple(x) for x in indices_to_compare])
        indices_to_compare = (np.array([x for x in aset & bset]))
        results[irun, idire] = len(indices_to_compare)

    # Subtract the counts of the longer runs from the shorter ones, to remove duplicates
    for indx in np.arange(array.shape[0]-2,-1,-1):
        results[indx, idire] -= np.sum(results[np.arange(array.shape[0]-1,indx,-1), idire])


print(results) # Each column is a direction, each row is the number of bins.

> [[ 2.  4.  3.]
   [ 1.  0.  1.]
   [ 1.  0.  0.]]

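So far this code does not work, except for direction (0,1,0), which has no connections and is therefore the trivial case. With this notation, the expected output would be: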

> [[ 2.  4.  1.]
   [ 1.  0.  0.]
   [ 0.  0.  1.]]

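The array comparison is taken from this link.

Edit 2:

I made a mistake when interpreting the coordinates. With the indices returned by np.argwhere, adding (1,0,0) steps along the z axis (the first array axis), so the expected result should be: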

> [[ 1.  4.  2.]
   [ 0.  0.  1.]
   [ 1.  0.  0.]]
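
To make the coordinate convention behind Edit 2 concrete, here is a minimal sketch (the variable names `step` and `shifted` are mine, not from the code above). It rebuilds the same array and shows that np.argwhere returns one row per nonzero element in (axis 0, axis 1, axis 2) order, so adding (1,0,0) only changes the first index:

```python
import numpy as np

# Same array as in the question: ones at (0,0,0), (0,0,1), (1,0,0) and (2,0,0).
array = np.zeros((3, 3, 3))
array[0, 0, 0] = array[0, 0, 1] = array[1, 0, 0] = array[2, 0, 0] = 1.0

# One row per nonzero element, columns ordered as (axis 0, axis 1, axis 2).
indices = np.argwhere(array == 1)

# Direction (1, 0, 0): a step of one along the first array axis.
step = np.array([1, 0, 0])
shifted = indices + step

print(indices.tolist())  # [[0, 0, 0], [0, 0, 1], [1, 0, 0], [2, 0, 0]]
print(shifted.tolist())  # [[1, 0, 0], [1, 0, 1], [2, 0, 0], [3, 0, 0]]
```

Two of the shifted positions, (1,0,0) and (2,0,0), are themselves ones in the array, which is why direction (1,0,0) contains the run of size 3.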

3 Answers:

Answer 0: (score: 1)

Here is a vectorized solution that works along an axis:

def run_lengths(arr, axis):
    # convert to boolean, for speed - no need to store 0 and 1 as floats
    arr = arr.astype(bool)

    # move the axis in question to the end: boolean indexing walks the array
    # in C order, so each line along that axis is then visited contiguously
    # and every run start is paired with the end of the same run
    arr = np.moveaxis(arr, axis, -1)

    # find positions at the start and end of a run
    first = np.empty(arr.shape, dtype=bool)
    first[..., :1] = arr[..., :1]
    first[..., 1:] = arr[..., 1:] & ~arr[..., :-1]
    last = np.empty(arr.shape, dtype=bool)
    last[..., -1:] = arr[..., -1:]
    last[..., :-1] = arr[..., :-1] & ~arr[..., 1:]

    # count the number in each run
    c = np.cumsum(arr, axis=-1)
    lengths = c[last] - c[first]

    # group the runs by length. Note the above gives 0 for a run of length 1,
    # so result[i] is the number of runs of length (i + 1)
    return np.bincount(lengths, minlength=arr.shape[-1])

for i in range(3):
    print(run_lengths(array, axis=i))

> [1 0 1]
  [4 0 0]
  [2 1 0]
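
The start/end/cumsum idea used by run_lengths may be easier to follow on a single 1D line. The sketch below is only an illustration: the example `line` and the names `lengths_minus_one` and `brute` are made up here, and the itertools.groupby pass is just a slow cross-check, not part of the answer.

```python
import numpy as np
from itertools import groupby

line = np.array([1, 1, 0, 1, 0, 1, 1, 1], dtype=bool)

# A run starts at a True that is not preceded by a True ...
first = line.copy()
first[1:] = line[1:] & ~line[:-1]
# ... and ends at a True that is not followed by a True.
last = line.copy()
last[:-1] = line[:-1] & ~line[1:]

# Running count of ones; the difference between the value at the end and
# at the start of a run is (run length - 1).
c = np.cumsum(line)                      # [1 2 2 3 3 4 5 6]
lengths_minus_one = c[last] - c[first]   # [1 0 2]

# counts[i] is the number of runs of length i + 1.
counts = np.bincount(lengths_minus_one, minlength=len(line))
print(counts)                            # [1 1 1 0 0 0 0 0]

# Cross-check: group consecutive equal values and keep the True groups.
brute = [len(list(group)) for value, group in groupby(line) if value]
print(brute)                             # [2, 1, 3]
```

In 1D the k-th start always belongs to the same run as the k-th end. In more dimensions that pairing only holds if the elements of each line are visited consecutively, which is worth keeping in mind when choosing which axis to iterate over last.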

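Answer 1: (score: 1)

Here is a solution that works for any direction and any combination of directions. Essentially, we use the array to build a graph in which the True elements are the nodes, and an edge is created whenever there is another node one step away in any of the allowed directions. We then compute the connected components and their sizes. The connected components are found with networkx, which is easily installed via pip.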

 import numpy as np
 import networkx

 def array_to_graph(bool_array, allowed_steps):
     """
     Arguments:
     ----------
     bool_array    -- boolean array
     allowed_steps -- list of allowed steps; e.g. given a 2D boolean array,
                      [(0, 1), (1, 1)] signifies that, coming from element
                      (i, j), only elements (i, j+1) and (i+1, j+1) are reachable

     Returns:
     --------
     g               -- networkx.DiGraph() instance
     idx_to_position -- dict mapping node idx to array position, e.g. (i, j)
     position_to_idx -- dict mapping array position, e.g. (i, j), to node idx
     """

     # ensure that bool_array is boolean
     assert bool_array.dtype == bool, "Input array has to be boolean!"

     # map array indices to node indices and vice versa
     # (node_positions is a list, not a lazy zip, because it is used several times)
     node_idx = range(np.sum(bool_array))
     node_positions = list(zip(*np.where(bool_array)))
     position_to_idx = dict(zip(node_positions, node_idx))

     # create graph
     g = networkx.DiGraph()
     for source in node_positions:
         for step in allowed_steps: # try to step in all allowed directions
             target = tuple(np.array(source) + np.array(step))
             if target in position_to_idx:
                 g.add_edge(position_to_idx[source], position_to_idx[target])

     idx_to_position = dict(zip(node_idx, node_positions))

     return g, idx_to_position, position_to_idx

 def get_connected_component_sizes(g):
     component_generator = networkx.connected_components(g)
     component_sizes = [len(component) for component in component_generator]
     return component_sizes

 def test():
     array = np.array([[[ 1.,  1.,  0.],
                        [ 0.,  0.,  0.],
                        [ 0.,  0.,  0.]],
                       [[1.,  0.,  0.],
                        [ 0.,  0.,  0.],
                        [ 0.,  0.,  0.]],
                       [[ 1.,  0.,  0.],
                        [ 0.,  0.,  0.],
                        [ 0.,  0.,  0.]]], dtype=bool)

     directions = [(1,0,0), (0,1,0), (0,0,1)]

     for direction in directions:
         graph, _, _ = array_to_graph(array, [direction])
         sizes = get_connected_component_sizes(graph.to_undirected())
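         # explicit integer dtype, so an empty `sizes` list is not treated as a float array
         counts = np.bincount(np.asarray(sizes, dtype=int))

         print(direction)
         for ii, c in enumerate(counts):
             if ii > 1:
                 print("Runs of size {}: {}".format(ii, c))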

     return

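Note: "runs" of size 1 are not counted (correctly), but I assume you are not interested in them. Otherwise, you can easily compute them afterwards: count the total number of True elements and subtract, for each longer run size in that direction, the number of runs times the run length.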
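
A minimal sketch of that post-hoc calculation, assuming `sizes` is the list of component sizes found for one direction (all of them 2 or larger, since isolated elements never become graph nodes). The helper name `count_singletons` is hypothetical and not part of the code above:

```python
import numpy as np

def count_singletons(bool_array, sizes):
    """Number of runs of size 1 for one direction.

    bool_array -- the boolean input array
    sizes      -- sizes of the connected components found for that direction
                  (hypothetical input; every entry is assumed to be >= 2)
    """
    total_true = int(np.count_nonzero(bool_array))
    # Every True element belongs to exactly one run, so whatever is not
    # covered by the longer runs must be a run of size 1.
    return total_true - int(sum(sizes))

# The array from the question: ones at (0,0,0), (0,0,1), (1,0,0) and (2,0,0).
array = np.zeros((3, 3, 3), dtype=bool)
array[0, 0, 0] = array[0, 0, 1] = array[1, 0, 0] = array[2, 0, 0] = True

print(count_singletons(array, [3]))  # direction (1,0,0): one run of size 3 -> 1
print(count_singletons(array, []))   # direction (0,1,0): no longer runs    -> 4
print(count_singletons(array, [2]))  # direction (0,0,1): one run of size 2 -> 2
```

Summing the component sizes is the same as summing (number of runs × run length) over all run lengths, so this matches the description in the note.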

Answer 2: (score: 0)

I am adding a variant of the code I was working with; it now seems to work:

import numpy as np


array = np.array([[[ 1.,  1.,  0.],
                   [ 0.,  0.,  0.],
                   [ 0.,  0.,  0.]],
                  [[1.,  0.,  0.],
                   [ 0.,  0.,  0.],
                   [ 0.,  0.,  0.]],
                  [[ 1.,  0.,  0.],
                   [ 0.,  0.,  0.],
                   [ 0.,  0.,  0.]]])

dire = np.array(([1, 0, 0], [0, 1, 0], [0, 0, 1])) # Possible directions
results = np.zeros((array.shape[0], len(dire)))  # Maximum runs will be 3

# Find all indices
indices = np.argwhere(array == 1) 

# Loop over each direction                 
for idire, dire in enumerate(dire):
    results[0, idire] = len(indices) # Count all ones (no connection required)
    indices_to_compare = indices 

    for irun in range(1, array.shape[0]):
        print('dir:',dire, 'run', irun)
        indices_displaced = indices + dire*irun             
        aset = set([tuple(x) for x in indices_displaced])
        bset = set([tuple(x) for x in indices_to_compare])
        indices_to_compare = (np.array([x for x in aset & bset]))
        results[irun, idire] = len(indices_to_compare)


    for iindx in np.arange(array.shape[0]-2,-1,-1):
        for jindx in np.arange(iindx+1, array.shape[0]):
            print(iindx,jindx,  results[jindx, idire])
            results[iindx, idire] -=  results[jindx, idire] *(jindx-iindx+1)         


print('\n',results)
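
As far as I can tell, the final double loop works for the following reason. Before the correction, results[k] holds the number of places where k+1 consecutive ones occur, and a run of length j+1 contributes j-k+1 such places. Working from the longest run downwards and subtracting those contributions leaves the number of runs of exactly each length. The standalone sketch below isolates that step; the function name `windows_to_runs` is mine:

```python
import numpy as np

def windows_to_runs(window_counts):
    """Turn window counts into counts of maximal runs.

    window_counts[k] -- number of positions where k+1 consecutive ones occur
    returns runs[k]  -- number of maximal runs of length exactly k+1
    """
    runs = np.array(window_counts, dtype=int)
    n = len(runs)
    # Start from the second-longest length; the longest one needs no correction.
    for k in range(n - 2, -1, -1):
        for j in range(k + 1, n):
            # Each run of length j+1 contains (j - k + 1) windows of length k+1.
            runs[k] -= runs[j] * (j - k + 1)
    return runs

# 1D line 1 1 1 0 1: four ones, two windows of length 2, one window of length 3.
print(windows_to_runs([4, 2, 1, 0, 0]))  # [1 0 1 0 0]

# First column of `results` for the question's array, before the correction.
print(windows_to_runs([4, 2, 1]))        # [1 0 1]
```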