Suppose I have a pandas DataFrame like the one below. The values are based on a distance matrix.
import numpy as np
import pandas as pd

A = pd.DataFrame([
    (1.0, 0.8, 0.6708203932499369, 0.6761234037828132, 0.7302967433402214),
    (0.8, 1.0, 0.6708203932499369, 0.8451542547285166, 0.9128709291752769),
    (0.6708203932499369, 0.6708203932499369, 1.0, 0.5669467095138409, 0.6123724356957946),
    (0.6761234037828132, 0.8451542547285166, 0.5669467095138409, 1.0, 0.9258200997725514),
    (0.7302967433402214, 0.9128709291752769, 0.6123724356957946, 0.9258200997725514, 1.0),
])
Output:
Out[65]:
          0         1         2         3         4
0  1.000000  0.800000  0.670820  0.676123  0.730297
1  0.800000  1.000000  0.670820  0.845154  0.912871
2  0.670820  0.670820  1.000000  0.566947  0.612372
3  0.676123  0.845154  0.566947  1.000000  0.925820
4  0.730297  0.912871  0.612372  0.925820  1.000000
I only want the upper triangle.
c2 = A.copy()
c2.values[np.tril_indices_from(c2)] = np.nan
Output:
Out[67]:
    0    1        2         3         4
0 NaN  0.8  0.67082  0.676123  0.730297
1 NaN  NaN  0.67082  0.845154  0.912871
2 NaN  NaN      NaN  0.566947  0.612372
3 NaN  NaN      NaN       NaN  0.925820
4 NaN  NaN      NaN       NaN       NaN
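The assignment through `c2.values` depends on `.values` handing back a writable view of the frame's data, which may not hold in every pandas configuration. As a hedged alternative (not the asker's code), the same masking can be sketched with a boolean mask from `np.triu` and `DataFrame.where`; the names `mask` and `upper` are only illustrative.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame([
    (1.0, 0.8, 0.6708203932499369, 0.6761234037828132, 0.7302967433402214),
    (0.8, 1.0, 0.6708203932499369, 0.8451542547285166, 0.9128709291752769),
    (0.6708203932499369, 0.6708203932499369, 1.0, 0.5669467095138409, 0.6123724356957946),
    (0.6761234037828132, 0.8451542547285166, 0.5669467095138409, 1.0, 0.9258200997725514),
    (0.7302967433402214, 0.9128709291752769, 0.6123724356957946, 0.9258200997725514, 1.0),
])

# True strictly above the diagonal; k=1 leaves the diagonal itself out.
mask = np.triu(np.ones(A.shape, dtype=bool), k=1)

# where() keeps values where the mask is True and puts NaN everywhere else.
# A itself is left unchanged.
upper = A.where(mask)
print(upper)
```

This version returns a new frame, so no explicit `copy()` is needed.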
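Now I want to get the (row, column) index pairs that satisfy some criterion.
For example: get the row and column indices where the value is greater than 0.8. For this, the output should be [1,3],[1,4],[3,4].
How can I do this?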
Answer 0 (score: 3)
You can use numpy's argwhere:
In [11]: np.argwhere(c2 > 0.8)
Out[11]:
array([[1, 3],
       [1, 4],
       [3, 4]])
To get the index/column labels (rather than their integer positions), you can use a list comprehension:
[(c2.index[i], c2.columns[j]) for i, j in np.argwhere(c2 > 0.8)]
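For reference, here is a self-contained sketch of the whole approach. It makes one assumption that is not in the answer: the comparison runs on the plain array from `c2.to_numpy()` rather than on the DataFrame itself, since how `np.argwhere` treats a pandas object may vary between versions. The names `positions` and `pairs` are illustrative.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame([
    (1.0, 0.8, 0.6708203932499369, 0.6761234037828132, 0.7302967433402214),
    (0.8, 1.0, 0.6708203932499369, 0.8451542547285166, 0.9128709291752769),
    (0.6708203932499369, 0.6708203932499369, 1.0, 0.5669467095138409, 0.6123724356957946),
    (0.6761234037828132, 0.8451542547285166, 0.5669467095138409, 1.0, 0.9258200997725514),
    (0.7302967433402214, 0.9128709291752769, 0.6123724356957946, 0.9258200997725514, 1.0),
])

# Keep only the strict upper triangle; everything else becomes NaN.
c2 = A.where(np.triu(np.ones(A.shape, dtype=bool), k=1))

# Compare on the plain ndarray. NaN > 0.8 is False, so masked cells drop out.
positions = np.argwhere(c2.to_numpy() > 0.8)

# Translate integer positions into index/column labels.
pairs = [(c2.index[i], c2.columns[j]) for i, j in positions]
print(pairs)  # [(1, 3), (1, 4), (3, 4)]
```

With the default integer index the labels and the positions coincide; the label lookup only matters once the frame has named rows or columns.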