Question

Here is a sample DataFrame:

ID,IS,Val1,Val2,Val3
1,100,11,9,1
2,101,3,15,16
3,99,10,18,3
1,97,29,25,26

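I am using idxmin to find the column that holds each row's minimum. Once I have the minimum, I want to check whether that minimum value is less than some number: if it is, I want to keep the row, otherwise I want to drop it. This is what I did with help from Stack Overflow.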

df1 = df.set_index('ID').iloc[:,1:].idxmin(axis=1).reset_index(name= 'New')

df2 = df1.loc[34 > df.iloc[:, 1:].min(1)]

I get this result:

ID   New
 1  Val3
 2  Val1
 3  Val3
 1  Val2

I get the same result when I use this code as well:

df2 = df1.loc[34 > df.iloc[:, 3:].min(1)]  # here the columns start from Val2, but it still gives the same result (including Val1)

ID   New
 1  Val3
 2  Val1
 3  Val3
 1  Val2

Why do I get the same result even when I select from the third column onward? What exactly is this line of code doing? df1.loc[34 > df.iloc[:, 1:].min(1)]

Answer 1

Both boolean conditions return True for every row, which is why the results are the same:

34 > df.iloc[:, 3:].min(1)
Out[202]: 
0    True
1    True
2    True
3    True
dtype: bool
34 > df.iloc[:, 1:].min(1)
Out[203]: 
0    True
1    True
2    True
3    True
dtype: bool
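
To make this concrete, here is a minimal sketch that rebuilds the sample frame and compares the two masks. The threshold 12 is not from the question; it is an assumption chosen only so that the two masks differ.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1],
    "IS": [100, 101, 99, 97],
    "Val1": [11, 3, 10, 29],
    "Val2": [9, 15, 18, 25],
    "Val3": [1, 16, 3, 26],
})

# Row-wise minimum over IS, Val1, Val2, Val3 versus over Val2, Val3 only.
min_from_1 = df.iloc[:, 1:].min(axis=1)
min_from_3 = df.iloc[:, 3:].min(axis=1)

# With 34, every row passes both tests, so .loc keeps all rows either way.
mask_a = 34 > min_from_1
mask_b = 34 > min_from_3

# With a smaller threshold (12 is an arbitrary choice), the masks differ.
mask_c = 12 > min_from_1
mask_d = 12 > min_from_3

print(mask_c.tolist())
print(mask_d.tolist())
```

The mask only decides which rows of df1 are kept. It never changes the labels already stored in the New column.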

iloc slices the DataFrame by position:

df.iloc[:, 1:]
Out[204]: 
    IS  Val1  Val2  Val3
0  100    11     9     1
1  101     3    15    16
2   99    10    18     3
3   97    29    25    26
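
One detail worth checking is which columns each positional slice actually covers. The sketch below assumes the same sample frame as in the question. It shows that position 1 means a different column once ID has been moved into the index.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1],
    "IS": [100, 101, 99, 97],
    "Val1": [11, 3, 10, 29],
    "Val2": [9, 15, 18, 25],
    "Val3": [1, 16, 3, 26],
})

# On the original frame, position 0 is ID, so 1: starts at IS.
cols_from_1 = list(df.iloc[:, 1:].columns)

# 3: starts at Val2.
cols_from_3 = list(df.iloc[:, 3:].columns)

# After set_index("ID"), ID is no longer a column, so 1: starts at Val1.
cols_indexed = list(df.set_index("ID").iloc[:, 1:].columns)

print(cols_from_1, cols_from_3, cols_indexed)
```

So the idxmin call in the question looks at Val1 to Val3, while the min call in the filter also includes IS.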

Answer 2

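Your df2 code only selects from the columns headed Val2 and Val3, but as long as the code for df1 still includes Val1, you will still see Val1 in the output.

If you index the data by column header and add the new columns to the same DataFrame, it may be easier to see what is happening.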

group1 = df[["Val1", "Val2", "Val3"]] # find the min among these 3 cols
group2 = df[["Val2", "Val3"]]   # find the min among only these 2 cols
df["min1"] = group1.min(axis=1)
df["col1"] = group1.idxmin(axis=1)
df["min2"] = group2.min(axis=1)
df["col2"] = group2.idxmin(axis=1)

filtered1 = df.loc[12 > df.min1]  # Val3, Val1, Val3 contain the minimum values
filtered2 = df.loc[12 > df.min2]  # Val3, Val3 contain the minimum values
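
If the goal is for the label and the filter to always use the same columns, one possible approach is to compute both from a single column selection. This is only a sketch: the helper name min_column_below and its parameters are hypothetical, not part of pandas or of the question's code.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1],
    "IS": [100, 101, 99, 97],
    "Val1": [11, 3, 10, 29],
    "Val2": [9, 15, 18, 25],
    "Val3": [1, 16, 3, 26],
})


def min_column_below(frame, cols, threshold):
    """Hypothetical helper: label each row with the column in `cols`
    that holds its minimum, keeping only rows whose minimum is below
    `threshold`."""
    sub = frame[cols]
    out = pd.DataFrame({"ID": frame["ID"], "New": sub.idxmin(axis=1)})
    return out.loc[sub.min(axis=1) < threshold]


all_three = min_column_below(df, ["Val1", "Val2", "Val3"], 34)
last_two = min_column_below(df, ["Val2", "Val3"], 34)

print(all_three)
print(last_two)
```

Because the labels and the mask come from the same selection, restricting the columns to Val2 and Val3 means Val1 can no longer appear in New.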

Problem selecting data with pandas iloc
