I want to remove some "duplicate" entries from the array below:
arr = array([[1, 2, 3, 0, 1],
             [0, 0, 0, 1, 0]])
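In the array above, the columns arr[:, 0], arr[:, 3], and arr[:, 4] should be treated as duplicates of one another, because they contain the same values regardless of order. In the end, I want this result: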
arr = array([[1, 2, 3],
             [0, 0, 0]])
Answer 0 (score: 0)
If you can use pandas, you can combine np.sort with pd.DataFrame.duplicated to build a Boolean indexer:
import numpy as np
import pandas as pd
arr = np.array([[1, 2, 3, 0, 1],
[0, 0, 0, 1, 0]])
arr_dups = pd.DataFrame(np.sort(arr.T)).duplicated().values
res = arr[:, ~arr_dups]
print(res)
# [[1 2 3]
#  [0 0 0]]
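To see why this works, here is a small sketch of the intermediate values. The names keys and mask are only for illustration and do not appear in the answer above; the input is the array from the question.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3, 0, 1],
                [0, 0, 0, 1, 0]])

# One row per original column, with the values inside each row sorted,
# so that [1, 0] and [0, 1] end up as the same key.
keys = np.sort(arr.T, axis=1)

# True for every row that repeats an earlier row (keep="first" is the default).
mask = pd.DataFrame(keys).duplicated().to_numpy()

# Keep only the columns that are not flagged as repeats.
res = arr[:, ~mask]

print(keys.tolist())  # [[0, 1], [0, 2], [0, 3], [0, 1], [0, 1]]
print(mask.tolist())  # [False, False, False, True, True]
print(res.tolist())   # [[1, 2, 3], [0, 0, 0]]
```

Because duplicated keeps the first occurrence by default, the surviving columns stay in their original order.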
Answer 1 (score: 0)
You can do it like this:
import numpy as np
arr = np.array([[1, 2, 3, 0, 1],
                [0, 0, 0, 1, 0]])
arr_sorted = arr.copy()  # work on a copy so the input array is not modified
arr_sorted.sort(0)  # sort the values within each column so duplicates look identical
_, indices = np.unique(arr_sorted, axis=1, return_index=True)  # index of the first occurrence of each unique column
arr_unique = arr[:, indices]  # pick those columns from the original array
print(arr_unique)
# [[1 2 3]
# [0 0 0]]
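One caveat, as far as I can tell: np.unique returns the unique columns in sorted order, not in order of first appearance. For the array in the question the two happen to coincide, but for other inputs the result can come back reordered. A minimal sketch of a variant that keeps the original column order by sorting the returned indices; the function name drop_unordered_duplicate_columns and the second test array are my own, purely for illustration:

```python
import numpy as np

def drop_unordered_duplicate_columns(arr):
    """Drop columns that repeat an earlier column, ignoring value order.

    Hypothetical helper for illustration; keeps the first occurrence of
    each column and preserves the original column order.
    """
    # Sort the values within each column so [1, 0] and [0, 1] compare equal.
    keys = np.sort(arr, axis=0)
    # Index of the first occurrence of each unique key column.
    _, indices = np.unique(keys, axis=1, return_index=True)
    # np.unique orders its output by key, so restore the original order.
    return arr[:, np.sort(indices)]

arr = np.array([[3, 1, 2, 0, 1],
                [0, 0, 0, 1, 0]])

result = drop_unordered_duplicate_columns(arr)
print(result.tolist())  # [[3, 1, 2], [0, 0, 0]]
```

Without the np.sort(indices) step, this input would come back as [[1, 2, 3], [0, 0, 0]], that is, with the columns rearranged.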