I have a huge 3D array to process. I want to relabel its elements in the following way:
import numpy as np
given_array = np.array([1, 1, 1, 3, 3, 5, 5, 5, 8, 8, 8, 8, 8, 23, 23, 23])
required_array = np.array([0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4])
I know there is a relabel_sequential method in skimage.segmentation, but it is too slow for my purposes. Any ideas for a faster approach would be appreciated.
Answer 0 (score: 2)
The fastest way should be to write a dedicated numba function tailored to your needs.
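from numba import njit
import numpy as np

@njit()
def relabel(array):
    i = 0
    n = -1        # current output label; becomes 0 at the first element
    previous = 0  # sentinel; assumes the first element is not 0
    while i < len(array):
        if previous != array[i]:
            # A new run of equal values starts here
            previous = array[i]
            n += 1
        array[i] = n  # overwrite the input in place
        i += 1
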
given_array = np.array([1, 1, 1, 3, 3, 5, 5, 5, 8, 8, 8, 8, 8, 23, 23, 23])
relabel(given_array)
given_array
array([0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4])
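This example makes several assumptions about the input: the array is sorted, the first number is positive, the array is one-dimensional, and it is acceptable to overwrite the array in place.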
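For comparison, here is a minimal NumPy-only sketch that relaxes two of the assumptions listed above: it does not overwrite the input and does not care what the first value is. It still assumes a sorted, one-dimensional array. The function name relabel_sorted is made up for illustration, and this has not been benchmarked against the numba version.

```python
import numpy as np

def relabel_sorted(values):
    """Return consecutive labels for a sorted 1D array, leaving the input untouched."""
    values = np.asarray(values)
    if values.size == 0:
        return np.zeros(0, dtype=np.intp)
    # True wherever an element differs from its predecessor, i.e. a new run starts.
    starts = values[1:] != values[:-1]
    labels = np.empty(values.shape, dtype=np.intp)
    labels[0] = 0
    # Every run start increases the label by one.
    labels[1:] = np.cumsum(starts)
    return labels

given_array = np.array([1, 1, 1, 3, 3, 5, 5, 5, 8, 8, 8, 8, 8, 23, 23, 23])
required_array = relabel_sorted(given_array)
```

Because the labels are derived from run boundaries, a first value of 0 or a negative first value is handled the same way as any other.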
Answer 1 (score: 1)
Try this and see if it is fast enough. Use the inverse returned by numpy.unique when called with the argument return_inverse=True:
In [52]: given_array = np.array([1, 1, 1, 3, 3, 5, 5, 5, 8, 8, 8, 8, 8, 23, 23, 23])
In [53]: u, inv = np.unique(given_array, return_inverse=True)
In [54]: inv
Out[54]: array([0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4])
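Since the question is about a 3D array, the same idea can be sketched for N-dimensional input. The shape of the inverse returned by numpy.unique for multi-dimensional input has differed between NumPy versions, so this sketch reshapes it explicitly, which should be harmless either way. The helper name relabel_nd and the small sample volume are made up for illustration.

```python
import numpy as np

def relabel_nd(labels):
    """Relabel an N-dimensional integer array to 0..k-1, preserving its shape."""
    labels = np.asarray(labels)
    # inverse holds, for each element, the index of its value in the sorted uniques.
    _, inverse = np.unique(labels, return_inverse=True)
    # Reshape explicitly so the result matches the input shape.
    return inverse.reshape(labels.shape)

volume = np.array([[[1, 1], [3, 23]],
                   [[8, 5], [5, 1]]])
relabeled = relabel_nd(volume)
```

Unlike the numba answer, this does not require sorted input, but numpy.unique sorts internally, which may be the dominant cost on a very large array.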
Answer 2 (score: 1)
If the given array is not sorted, this is faster than sorting the array first:
from numba import njit
import numpy as np

@njit()
def relabel_fast(array, count):
    # First pass: count how often each value occurs
    i = 0
    while i < len(array):
        data = array[i]
        count[data] += 1
        i += 1
    a = 1  # Position in count (starts at 1, so a value of 0 is skipped)
    b = 0  # Position in array
    c = 0  # The current output number
    # Second pass: write the labels back in sorted order
    while a < len(count):
        d = 0  # The number of 'c' to output
        if count[a] > 0:
            while d < count[a]:
                array[b] = c
                b += 1
                d += 1
            c += 1
        a += 1

def relabel(given_array):
    # Allocate the count array (one slot per possible value) before calling the Numba function
    count = np.zeros(np.max(given_array) + 1, dtype=int)
    relabel_fast(given_array, count)

#given_array = np.array([1, 1, 1, 3, 3, 5, 5, 5, 8, 8, 8, 8, 8, 23, 23, 23])
given_array = np.array([1, 23, 1, 3, 8, 3, 5, 5, 8, 8, 8, 5, 8, 23, 23, 1])
relabel(given_array)
given_array
array([0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4])
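Note that the counting approach above writes the labels back in sorted order, so the output no longer lines up with the original element positions. If each element should keep its position, a lookup table is one possible variant. The sketch below is NumPy-only, assumes a non-empty array of non-negative integers whose maximum is small enough to allocate a table of that size, and uses a made-up function name, relabel_with_lookup.

```python
import numpy as np

def relabel_with_lookup(labels):
    """Map each distinct non-negative integer to 0..k-1 without sorting the data."""
    labels = np.asarray(labels)
    # Mark which values occur; one slot per possible value (assumes values >= 0).
    present = np.zeros(int(labels.max()) + 1, dtype=bool)
    present[labels] = True
    # The running count of occurring values, minus one, is the new label of each value.
    lookup = np.cumsum(present) - 1
    # Fancy indexing keeps the shape of the input, so 3D arrays work too.
    return lookup[labels]

given_array = np.array([1, 23, 1, 3, 8, 3, 5, 5, 8, 8, 8, 5, 8, 23, 23, 1])
relabeled = relabel_with_lookup(given_array)
```

Here 1, 3, 5, 8 and 23 map to 0, 1, 2, 3 and 4 respectively, and each element is relabeled where it stands. The table costs memory proportional to the largest value, so this is a poor fit for sparse labels with a very large maximum.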