I am trying to read a CSV file containing XYZ data, but gridding it with Python Natgrid raises the error: two input triples have the same x/y coordinates. Here is my array:
np.array([[41.540588, -100.348335, 0.052785],
[41.540588, -100.348335, 0.053798],
[42.540588, -102.348335, 0.021798],
[42.540588, -102.348335, 0.022798],
[43.540588, -103.348335, 0.031798]])
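I want to remove the XY duplicates and keep the largest Z value for each pair. Based on the example above, I want to drop all the smaller values so that the array becomes: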
np.array([[41.540588, -100.348335, 0.053798],
[42.540588, -102.348335, 0.022798],
[43.540588, -103.348335, 0.031798]])
I tried np.unique, but so far I have had no luck, because it does not seem to work on rows (only on columns).
Answer 0 (score: 0)
If you can use pandas, you can take advantage of groupby and max:
>>> pandas.DataFrame(arr).groupby([0,1], as_index=False).max().values
array([[ 4.15405880e+01, -1.00348335e+02, 5.37980000e-02],
[ 4.25405880e+01, -1.02348335e+02, 2.27980000e-02],
[ 4.35405880e+01, -1.03348335e+02, 3.17980000e-02]])
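The groupby approach above can be written as a self-contained script. This is only a sketch: the function name max_z_per_xy and the column labels x, y, z are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd


def max_z_per_xy(points):
    """Return one row per unique (x, y) pair, keeping the largest z."""
    # Column labels are illustrative; the original answer uses 0, 1, 2.
    df = pd.DataFrame(points, columns=["x", "y", "z"])
    # groupby sorts by the key columns by default, so the rows come
    # back ordered by x, then y.
    deduped = df.groupby(["x", "y"], as_index=False)["z"].max()
    return deduped.to_numpy()


arr = np.array([[41.540588, -100.348335, 0.052785],
                [41.540588, -100.348335, 0.053798],
                [42.540588, -102.348335, 0.021798],
                [42.540588, -102.348335, 0.022798],
                [43.540588, -103.348335, 0.031798]])

result = max_z_per_xy(arr)
print(result)
```

Note that groupby compares the coordinates for exact floating-point equality, so two points are merged only if their x and y values are bit-for-bit identical.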
Answer 1 (score: 0)
You can use pandas by sorting and then dropping duplicates:
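import pandas as pd
df = pd.DataFrame(arr)
# sort by Z descending so the largest Z comes first within each X/Y pair;
# drop_duplicates then keeps that first row
res = (df.sort_values(2, ascending=False)
         .drop_duplicates([0, 1])
         .sort_values(0)
         .values)
print(repr(res))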
array([[ 4.15405880e+01, -1.00348335e+02, 5.37980000e-02],
[ 4.25405880e+01, -1.02348335e+02, 2.27980000e-02],
[ 4.35405880e+01, -1.03348335e+02, 3.17980000e-02]])
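Since the question starts from a CSV file, the same sort-and-drop idea could be applied right after loading. The following is a rough sketch: the in-memory csv_text stands in for the real file, and the header names x, y, z are assumptions, because the question does not show the file layout.

```python
import io

import pandas as pd

# Stand-in for the real CSV file; the header names are assumed.
csv_text = """x,y,z
41.540588,-100.348335,0.052785
41.540588,-100.348335,0.053798
42.540588,-102.348335,0.021798
42.540588,-102.348335,0.022798
43.540588,-103.348335,0.031798
"""

df = pd.read_csv(io.StringIO(csv_text))

deduped = (
    df.sort_values("z", ascending=False)                # largest z first
      .drop_duplicates(subset=["x", "y"], keep="first")  # keep that row per (x, y)
      .sort_values(["x", "y"])                           # restore a stable order
      .reset_index(drop=True)
)

# Plain float array with one row per unique (x, y) pair.
xyz = deduped.to_numpy()
print(xyz)
```

With a real file, io.StringIO(csv_text) would be replaced by the file path.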
Answer 2 (score: 0)
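Here is a numpy approach: first sort by Z in descending order, then find the index of the first occurrence of each unique X, Y pair, and use those indices to select the rows: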
import numpy as np

a = np.array([[41.540588, -100.348335, 0.052785],
              [41.540588, -100.348335, 0.053798],
              [42.540588, -102.348335, 0.021798],
              [42.540588, -102.348335, 0.022798],
              [43.540588, -103.348335, 0.031798]])
# sort rows by Z in descending order
b = a[np.argsort(a[:, 2])[::-1]]
# index of the first occurrence of each unique X/Y pair
# (the axis argument requires NumPy 1.13 or later)
u = np.unique(b[:, :2], return_index=True, axis=0)[1]
# select those rows; each holds the largest Z for its X/Y pair
c = b[u]
>>> c
array([[ 4.15405880e+01, -1.00348335e+02, 5.37980000e-02],
[ 4.25405880e+01, -1.02348335e+02, 2.27980000e-02],
[ 4.35405880e+01, -1.03348335e+02, 3.17980000e-02]])
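For reference, a variant that avoids the axis argument of np.unique is to sort once with np.lexsort and keep the last row of every X/Y group. This is an alternative sketch rather than the method from the answer above, and the function name keep_max_z is made up for illustration.

```python
import numpy as np


def keep_max_z(points):
    """Keep, for every distinct (x, y), the row with the largest z."""
    points = np.asarray(points, dtype=float)
    if points.shape[0] == 0:
        return points
    # lexsort treats the LAST key as the primary one: x, then y, then z.
    order = np.lexsort((points[:, 2], points[:, 1], points[:, 0]))
    s = points[order]
    # A row closes its group when the next row has a different (x, y);
    # the final row always closes the last group.
    differs_from_next = np.any(s[1:, :2] != s[:-1, :2], axis=1)
    is_last = np.append(differs_from_next, True)
    # Within a group z is ascending, so the last row has the largest z.
    return s[is_last]


a = np.array([[41.540588, -100.348335, 0.052785],
              [41.540588, -100.348335, 0.053798],
              [42.540588, -102.348335, 0.021798],
              [42.540588, -102.348335, 0.022798],
              [43.540588, -103.348335, 0.031798]])

c = keep_max_z(a)
print(c)
```

The result is ordered by x and then y, and the input rows may arrive in any order.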