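Python: Combining array rows based on the difference between the last element of the previous row and the first element of the next row

Time: 2017-08-23 15:36:51

Tags: python arrays numpy

As the title says, I have an (n, 2) numpy array that records the start and end indices of a series of segments, for example with n = 6: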

import numpy as np
# x records the (start, end) index pairs corresponding to six segments
x = np.array(([0,4],    # the 1st seg ranges from index 0 ~ 4
              [5,9],    # the 2nd seg ranges from index 5 ~ 9, etc.
              [10,13],
              [15,20],
              [23,30],
              [31,40]))

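Now I want to combine segments that are separated by only a small gap. For example, merge consecutive segments if the gap between them is no larger than 1, so the desired output would be: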

y = np.array([[0,13],    # The 1st seg's end is close to the 2nd's start,
                         # and the 2nd seg's end is close to the 3rd's start, so they are combined.
              [15,20],   # The 4th seg is far from its neighbouring segs,
                         # so it remains untouched.
              [23,40]])  # The 5th and 6th segs are close, so they are combined

so that the output has three segments instead of six. Any suggestions would be appreciated!

2 answers:

Answer 0 (score: 2)

If we can assume that the segments are sorted and that none is entirely contained in a neighbour, you can do this by identifying where the gap between the end of one range and the start of the next exceeds your criterion, and then stitching the pieces together at those points:

start = x[1:, 0]  # select columns, ignoring the beginning of the first range
end = x[:-1, 1]  # and the end of the final range
mask = start>end+1  # identify where consecutive rows have too great a gap
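
The snippet above stops at the mask. A minimal sketch of the stitching step, assuming the same sorted `x` as in the question (the names `new_starts`, `new_ends` and `y` are introduced here for illustration and are not part of the original answer): a merged segment begins at the very first start and at every start that follows a large gap, and it ends at every end that precedes a large gap and at the very last end.

```python
import numpy as np

x = np.array([[0, 4], [5, 9], [10, 13], [15, 20], [23, 30], [31, 40]])

start = x[1:, 0]       # starts of rows 1..n-1
end = x[:-1, 1]        # ends of rows 0..n-2
mask = start > end + 1  # True where the gap to the previous row is too large

# Assumed stitching step: keep the first start plus every start after a break,
# and every end before a break plus the last end.
new_starts = np.concatenate(([x[0, 0]], start[mask]))
new_ends = np.concatenate((end[mask], [x[-1, 1]]))
y = np.column_stack((new_starts, new_ends))
print(y)
```

For the example array this should give the three rows asked for in the question: `[0, 13]`, `[15, 20]` and `[23, 40]`.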

Answer 1 (score: 2)

Here's a vectorized NumPy solution -

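def merge_boundaries(x):
    # True where the next row starts more than 1 past the previous row's end
    mask = (x[1:,0] - x[:-1,1]) > 1
    idx = np.flatnonzero(mask)        # rows that close a merged group
    start = np.r_[0,idx+1]            # first row of each group
    stop = np.r_[idx, x.shape[0]-1]   # last row of each group
    return np.c_[x[start,0], x[stop,1]]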

Sample run -

In [230]: x
Out[230]: 
array([[ 0,  4],
       [ 5,  9],
       [10, 13],
       [15, 20],
       [23, 30],
       [31, 40]])

In [231]: merge_boundaries(x)
Out[231]: 
array([[ 0, 13],
       [15, 20],
       [23, 40]])
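
Both answers hard-code the threshold of 1. A hedged sketch of a generalization, assuming the rows are sorted by start and do not nest; `merge_segments` and its `max_gap` parameter are hypothetical names, not part of either answer:

```python
import numpy as np

def merge_segments(x, max_gap=1):
    """Merge consecutive (start, end) rows whose gap is at most max_gap.

    Assumes x is an (n, 2) integer array with n >= 1, sorted by start,
    with no row nested inside a neighbour.
    """
    x = np.asarray(x)
    # Difference between each row's start and the previous row's end.
    gaps = x[1:, 0] - x[:-1, 1]
    # Indices of rows after which a new merged segment has to begin.
    breaks = np.flatnonzero(gaps > max_gap)
    first = np.r_[0, breaks + 1]       # first row of each group
    last = np.r_[breaks, len(x) - 1]   # last row of each group
    return np.c_[x[first, 0], x[last, 1]]

x = np.array([[0, 4], [5, 9], [10, 13], [15, 20], [23, 30], [31, 40]])
print(merge_segments(x))             # threshold from the question
print(merge_segments(x, max_gap=2))  # looser threshold, fewer segments
```

With `max_gap=2` the fourth segment is also absorbed into the first group, since 15 - 13 = 2, while the jump from 20 to 23 still counts as a break.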