Question

I am trying to read a CSV file containing XYZ data, but gridding it with Python Natgrid raises an error: two input triples have the same x/y coordinates. Here is my array:

np.array([[41.540588, -100.348335, 0.052785],
   [41.540588, -100.348335, 0.053798],
   [42.540588, -102.348335, 0.021798],
   [42.540588, -102.348335, 0.022798],
   [43.540588, -103.348335, 0.031798]])

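I want to remove the XY duplicates and keep the largest Z value for each pair. Based on the example above, I want to drop the rows with the smaller Z values, leaving this array: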

np.array([[41.540588, -100.348335, 0.053798],
   [42.540588, -102.348335, 0.022798],
   [43.540588, -103.348335, 0.031798]])

I tried np.unique, but so far I have had no luck, because it does not seem to work on rows (only on columns).

Answer 1

If you can use pandas, you can take advantage of groupby and max:

>>> pandas.DataFrame(arr).groupby([0,1], as_index=False).max().values

array([[ 4.15405880e+01, -1.00348335e+02,  5.37980000e-02],
       [ 4.25405880e+01, -1.02348335e+02,  2.27980000e-02],
       [ 4.35405880e+01, -1.03348335e+02,  3.17980000e-02]])

Answer 2

You can use pandas by sorting and then dropping duplicates:

import pandas as pd

df = pd.DataFrame(arr)

res = df.sort_values(2, ascending=False)\
        .drop_duplicates([0, 1])\
        .sort_values(0).values

print(res)

array([[  4.15405880e+01,  -1.00348335e+02,   5.37980000e-02],
       [  4.25405880e+01,  -1.02348335e+02,   2.27980000e-02],
       [  4.35405880e+01,  -1.03348335e+02,   3.17980000e-02]])

Answer 3

Here is a numpy approach: first sort by Z in descending order, then find the first index of each unique X, Y pair, and index with it:

import numpy as np

a = np.array([[41.540588, -100.348335, 0.052785],
   [41.540588, -100.348335, 0.053798],
   [42.540588, -102.348335, 0.021798],
   [42.540588, -102.348335, 0.022798],
   [43.540588, -103.348335, 0.031798]])

# sort rows by Z, descending, so the largest Z of each pair comes first
b = a[np.argsort(a[:,2])[::-1]]
# get first index for each unique x,y pair
u = np.unique(b[:,:2],return_index=True,axis=0)[1]
# index
c = b[u]
>>> c
array([[ 4.15405880e+01, -1.00348335e+02,  5.37980000e-02],
       [ 4.25405880e+01, -1.02348335e+02,  2.27980000e-02],
       [ 4.35405880e+01, -1.03348335e+02,  3.17980000e-02]])
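
If your numpy version does not support the axis argument of np.unique, a variant of the same idea can be sketched with np.lexsort. This is an alternative technique, not the code above: it sorts by X, then Y, then Z, and keeps the last row of each X, Y group. The names order, s, is_last and result are only illustrative.

```python
import numpy as np

a = np.array([[41.540588, -100.348335, 0.052785],
              [41.540588, -100.348335, 0.053798],
              [42.540588, -102.348335, 0.021798],
              [42.540588, -102.348335, 0.022798],
              [43.540588, -103.348335, 0.031798]])

# np.lexsort treats the LAST key as the primary one,
# so rows end up ordered by X, then Y, then Z (all ascending)
order = np.lexsort((a[:, 2], a[:, 1], a[:, 0]))
s = a[order]

# a row closes its (X, Y) group when the next row has a different pair;
# the final row always closes a group
is_last = np.ones(len(s), dtype=bool)
is_last[:-1] = np.any(s[1:, :2] != s[:-1, :2], axis=1)

# within each group Z is ascending, so the last row holds the maximum Z
result = s[is_last]
```

Like np.unique, this compares the coordinates for exact floating-point equality, so points that differ only by rounding noise would still count as distinct pairs.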
