Question

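I have a large dataframe of standardized and scaled data whose values should lie in the range 0-1. But when I print its maximum value, I get 1.000000002. The describe() method does not show this value. So I am trying to track down the problem and want to print the offending row. All the other answers I came across talk about printing the row holding the maximum of one particular column. How can I print the row that contains the maximum value of the whole dataframe? Thanks for your help!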

test = pd.DataFrame({'att1'  : [0.1, 0.001, 0.0001,
                            1, 2,
                            0.5, 0, -1, -2],
                   'att2':[0.01, 0.0001, 0.00001,
                            1.1, 2.2,
                            2.37, 0, -1.5, -2.5]})
test.max().max()
Out: 2.37000

Desired result:

    att1    att2
5   0.5     2.37

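UPD: I updated the test dataframe because it was causing confusion (my fault!). I need to print the one row that contains the maximum value of the whole dataframe.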

Answer 1

I would use stack followed by idxmax:

test.iloc[[test.stack().idxmax()[0]]]
Out[154]: 
   att1  att2
5   2.3  2.37
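
To make the intermediate steps visible, here is a minimal sketch using the question's updated data. It assumes pandas is imported as pd, and the names row_label, col_label and max_row are only illustrative. It also swaps in loc for iloc, because idxmax returns index labels, which coincide with positions only for a default integer index.

```python
import pandas as pd

test = pd.DataFrame({
    'att1': [0.1, 0.001, 0.0001, 1, 2, 0.5, 0, -1, -2],
    'att2': [0.01, 0.0001, 0.00001, 1.1, 2.2, 2.37, 0, -1.5, -2.5],
})

# stack() reshapes the frame into a Series indexed by (row label, column label),
# so idxmax() returns the location of the global maximum as a tuple.
row_label, col_label = test.stack().idxmax()

# Double brackets keep the result a one-row DataFrame rather than a Series.
# loc is used because row_label is a label, not a position (assumption: the
# caller wants label-based lookup, which also works for non-default indexes).
max_row = test.loc[[row_label]]

print(row_label, col_label)
print(max_row)
```

If the maximum occurs more than once, idxmax reports only the first occurrence, so this returns a single row.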

Answer 2

Edit:
After the OP's further explanation, I think comparing the values array against values.max() is more flexible, as follows:

test[test.values == test.values.max()]

It returns the row(s) of the dataframe that contain the maximum value. If att1_max equals att2_max but they are on different rows, both rows are returned. In that case, if a single row is preferred, append head(1) to it.

att1_max and att2_max on the same row:

Out[660]:
     att1     att2
0  0.1000  0.01000
1  0.0010  0.00010
2  0.0001  0.00001
3  1.0000  1.10000
4  2.0000  2.20000
5  2.3000  2.37000
6  0.0000  0.00000
7 -1.0000 -1.50000
8 -2.0000 -2.50000

In [661]: test[test.values == test.values.max()]
Out[661]:
   att1  att2
5   2.3  2.37

att1_max and att2_max on different rows:

Out[664]:
     att1     att2
0  0.1000  0.01000
1  0.0010  0.00010
2  0.0001  0.00001
3  1.0000  1.10000
4  2.0000  2.20000
5  2.3000  1.37000
6  0.0000  0.00000
7 -1.0000 -1.50000
8 -2.0000 -2.50000

In [665]: test[test.values == test.values.max()]
Out[665]:
   att1  att2
5   2.3  1.37

att1_max equal to att2_max, but on different rows (in this case stack returns only 1 row, while this returns both rows):

Out[668]:
      att1      att2
0   0.1000   0.01000
1  25.0500   0.00010
2   0.0001   0.00001
3   1.0000   1.10000
4   2.0000   2.20000
5   2.3000   1.37000
6   0.0000   0.00000
7  -1.0000  25.05000
8  -2.0000  -2.50000

In [669]: test[test.values == test.values.max()]
Out[669]:
    att1     att2
1  25.05   0.0001
7  -1.00  25.0500

Note: in the last case, if a single row is needed, just add head(1):

In [670]: test[test.values == test.values.max()].head(1)
Out[670]:
    att1    att2
1  25.05  0.0001

Note 2: if att1_max and att2_max are equal and on the same row, that row is shown twice. In that case, use drop_duplicates() to handle it.

Original:

@Wen-Ben's answer is good, but I think using stack here is unnecessary. I prefer idxmax and drop_duplicates:

test.iloc[test.idxmax()].drop_duplicates()

or

test.loc[test.idxmax().drop_duplicates()]

att1_max and att2_max on the same row:

In [510]: test.iloc[test.idxmax()].drop_duplicates()
Out[510]:
   att1  att2
5   2.3  2.37

So when att1_max and att2_max are on the same row, exactly 1 row is returned. When they are on different rows, 2 rows are returned: the rows where att1_max and att2_max occur.
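
As a hedged variant of the comparison against the global maximum, the sketch below reduces the element-wise comparison to one boolean per row with any(axis=1). This is a swapped-in alternative, not the answer's own code. It uses data in which the maximum occurs in both columns on different rows, and the names global_max, row_mask, rows_with_max and first_row_only are illustrative only.

```python
import pandas as pd

# Data where the global maximum (25.05) appears in att1 on row 1
# and in att2 on row 7.
test = pd.DataFrame({
    'att1': [0.1, 25.05, 0.0001, 1, 2, 2.3, 0, -1, -2],
    'att2': [0.01, 0.0001, 0.00001, 1.1, 2.2, 1.37, 0, 25.05, -2.5],
})

global_max = test.values.max()

# One boolean per row: True where any column equals the global maximum.
row_mask = (test == global_max).any(axis=1)

# Every row holding the maximum, each listed once.
rows_with_max = test[row_mask]

# Keep only the first such row if a single row is wanted.
first_row_only = rows_with_max.head(1)

print(rows_with_max)
print(first_row_only)
```

Because the mask has one entry per row, a row that holds the maximum in both columns is still selected only once, so drop_duplicates() is not needed with this form.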

Answer 3

Let's use np.where, which returns the row and column indices:

r, _ = np.where(test.values == np.max(test.values))
test.iloc[r]

Output:

   att1  att2
5   2.3  2.37
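
The snippet above discards the column positions. A small sketch of how both arrays from np.where might be used, run on the question's updated data, is shown here. The names rows, cols, max_rows and max_cols are illustrative, and np.unique is an added step (not in the answer) that guards against a row being repeated when the maximum occurs in more than one of its columns.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({
    'att1': [0.1, 0.001, 0.0001, 1, 2, 0.5, 0, -1, -2],
    'att2': [0.01, 0.0001, 0.00001, 1.1, 2.2, 2.37, 0, -1.5, -2.5],
})

# np.where on a 2-D boolean array returns two arrays:
# row positions and column positions of the True entries.
rows, cols = np.where(test.values == test.values.max())

# Positions, not labels, so iloc is the matching accessor.
# np.unique removes repeated row positions.
max_rows = test.iloc[np.unique(rows)]

# Map column positions back to column names.
max_cols = test.columns[cols].tolist()

print(max_rows)
print(max_cols)
```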
