Label elements based on uniqueness

Time: 2015-12-17 19:54:41

Tags: python arrays numpy

I have a sorted array like this -

[[ 0    ]
 [ 0    ]
 [ 0    ]
 [ 3    ]
 [ 4    ]
 [ 15   ]
 [ 17   ]
 [ 87   ]
 [ 87   ]
 [ 87   ]
 [ 92   ]
 [ 180  ]
 [ 180  ]
 [ 215  ]
 [ 602  ]
 [ 1254 ]]

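I want to label the array elements based on their uniqueness, so duplicate values should get the same label. The leading duplicate 0 elements would be labeled 0, and the remaining values should get consecutive numbers. Later in the array there are three 87 values, which should all be labeled 5, and then the two 180 values should be labeled 7. The final output I'm looking for is -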

[[    0     0 ]
 [    0     0 ]
 [    0     0 ]
 [    3     1 ]
 [    4     2 ]
 [   15     3 ]
 [   17     4 ]
 [   87     5 ]
 [   87     5 ]
 [   87     5 ]
 [   92     6 ]
 [  180     7 ]
 [  180     7 ]
 [  215     8 ]
 [  602     9 ]
 [ 1254    10 ]]

2 answers:

Answer 0 (score: 1)

You want IDs based on the uniqueness among the elements. You can get these with the optional argument return_inverse of np.unique, like so -

_,idx = np.unique(A,return_inverse=True)

Sample run:

1) Input array -

In [86]: A
Out[86]: 
array([[   0],
       [   0],
       [   0],
       [   3],
       [   4],
       [  15],
       [  17],
       [  87],
       [  87],
       [  87],
       [  92],
       [ 180],
       [ 180],
       [ 215],
       [ 602],
       [1254]])

2) Get the unique IDs for all elements and display them alongside the input elements -

In [87]: _,idx = np.unique(A,return_inverse=True)

In [88]: np.column_stack((A,idx))
Out[88]: 
array([[   0,    0],
       [   0,    0],
       [   0,    0],
       [   3,    1],
       [   4,    2],
       [  15,    3],
       [  17,    4],
       [  87,    5],
       [  87,    5],
       [  87,    5],
       [  92,    6],
       [ 180,    7],
       [ 180,    7],
       [ 215,    8],
       [ 602,    9],
       [1254,   10]])
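
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same np.unique idea. The array literal is just the question's data typed in by hand, and the names uniques, labeled and reconstructed are my own, not from the answer. Flattening with ravel() is an extra precaution I added so the inverse indices come back 1-D whatever NumPy version is installed.

```python
import numpy as np

# Same values as the question's column vector (shape (16, 1)).
A = np.array([0, 0, 0, 3, 4, 15, 17, 87, 87, 87,
              92, 180, 180, 215, 602, 1254]).reshape(-1, 1)

# Flatten first so the inverse indices are 1-D.
# uniques holds the sorted distinct values; idx[i] is the position of
# A's i-th element within uniques, which serves as its label.
uniques, idx = np.unique(A.ravel(), return_inverse=True)

# Put the original values and their labels side by side.
labeled = np.column_stack((A, idx))

# Indexing the distinct values with the labels rebuilds the input.
reconstructed = uniques[idx]
```

Note that np.unique returns the distinct values in sorted order, so the labels follow value order rather than order of first appearance. For a sorted input like the one in the question the two coincide.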

Answer 1 (score: 0)

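You can loop over the array and, whenever the current element differs from the previous one, increment a counter. Then append the count to the current element, like so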

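import numpy as np

# l is the input as a list of one-element lists, e.g. [[0], [0], [0], [3], ...]
count = 0
prev = l[0][0]
for i in l:
    if i[0] != prev:   # value changed, so move on to the next label
        prev = i[0]
        count += 1
    i.append(count)    # attach the label to the current row

print(np.array(l))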


[[   0    0]
 [   0    0]
 [   0    0]
 [   3    1]
 [   4    2]
 [  15    3]
 [  17    4]
 [  87    5]
 [  87    5]
 [  87    5]
 [  92    6]
 [ 180    7]
 [ 180    7]
 [ 215    8]
 [ 602    9]
 [1254   10]]
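
The loop above can also be written without an explicit Python loop. This is only a sketch of the same "increment on change" idea, not something the answer itself proposed; the names flat, changed, labels and result are mine. Like the loop, it assumes equal values sit next to each other, which holds for the sorted array in the question.

```python
import numpy as np

# Same values as the question's column vector (shape (16, 1)).
A = np.array([0, 0, 0, 3, 4, 15, 17, 87, 87, 87,
              92, 180, 180, 215, 602, 1254]).reshape(-1, 1)

flat = A.ravel()

# True wherever an element differs from the one before it.
changed = flat[1:] != flat[:-1]

# Running count of changes; the first element always gets label 0.
labels = np.concatenate(([0], np.cumsum(changed)))

# Values and labels side by side, as in the desired output.
result = np.column_stack((flat, labels))
```

On unsorted input this would hand out a new label every time the value changes, even if that value was seen earlier, so np.unique with return_inverse is the safer choice there.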