Question

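Let's assume we have a pandas DataFrame with three features, as shown below.

Each row represents a customer, and each column represents a feature of that customer.

I want to get the row numbers and add them to a list (or leave them out), depending on the feature values.

Say we want to find the row numbers where feature1 is less than 100 or feature2 is greater than 500.

I have written some code for this, shown below.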

import pandas as pd

d = [{'feature1': 100, 'feature2': 520, 'feature3': 54},
     {'feature1': 102, 'feature2': 504, 'feature3': 51},
     {'feature1': 241, 'feature2': 124, 'feature3': 4},
     {'feature1': 340, 'feature2': 830, 'feature3': 700},
     {'feature1': 98, 'feature2': 430, 'feature3': 123}]

df = pd.DataFrame(d)
print(df)
print("----")

dataframe1 = df[(df['feature1'] < 100)]
dataframe2 = df[(df['feature2'] > 500)]

print(dataframe1)
print(dataframe2)
# here I would like to get the row numbers and add them to a result list

Program output

   feature1  feature2  feature3
0       100       520        54
1       102       504        51
2       241       124         4
3       340       830       700
4        98       430       123
----
   feature1  feature2  feature3
4        98       430       123
   feature1  feature2  feature3
0       100       520        54
1       102       504        51
3       340       830       700

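I can't figure out how to combine dataframe1 and dataframe2 and then get their row numbers. If you know how to do this, could you share it?

I would like to end up with a result list like this: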

result = [ 4, 0, 1, 3]

Answer 1

The question is not entirely clear, but maybe this:

df.query('feature1 < 100 | feature2 > 500').index.tolist()

[0, 1, 3, 4]
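
For comparison, here is a minimal sketch of the same selection written as a boolean mask instead of a query string. It rebuilds the sample data from the question so it runs on its own; the names mask and rows are only illustrative. Because a row is selected when either condition holds, the result follows the DataFrame's index order rather than the [4, 0, 1, 3] order asked for in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'feature1': [100, 102, 241, 340, 98],
                   'feature2': [520, 504, 124, 830, 430],
                   'feature3': [54, 51, 4, 700, 123]})

# Element-wise OR of the two conditions; each comparison needs its own parentheses.
mask = (df['feature1'] < 100) | (df['feature2'] > 500)

# Index labels of the rows where the mask is True, in index order.
rows = df.index[mask].tolist()
print(rows)  # [0, 1, 3, 4]
```

Each row appears at most once here, even if it satisfies both conditions.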

Answer 2

How about this?

ls = []

ls.extend(df.index[df['feature1'] < 100])
ls.extend(df.index[df['feature2'] > 500])

print(ls)
[4, 0, 1, 3]
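
One thing to watch with the two extend calls: a row that satisfies both conditions would be added twice. The sample data has no such row, so the following is only a sketch of one way to keep the same ordering while dropping repeats, assuming duplicates are not wanted. The names first, second, and result are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'feature1': [100, 102, 241, 340, 98],
                   'feature2': [520, 504, 124, 830, 430],
                   'feature3': [54, 51, 4, 700, 123]})

# Row labels for each condition, collected separately.
first = df.index[df['feature1'] < 100].tolist()
second = df.index[df['feature2'] > 500].tolist()

# dict.fromkeys keeps the first occurrence of each label and preserves order,
# so a row matching both conditions is listed only once.
result = list(dict.fromkeys(first + second))
print(result)  # [4, 0, 1, 3]
```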

Answer 3

You want to output the index as a list.

print(df[df['feature2'] > 500].index.tolist())

[0, 1, 3]
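
Note that .index returns index labels, which match row numbers only when the DataFrame has the default 0-based index, as in the question. If the index were something else, positional row numbers could be recovered from the boolean mask. The sketch below assumes a hypothetical string index ('a' to 'e') that is not part of the original data, and uses numpy.flatnonzero to get the positions.

```python
import numpy as np
import pandas as pd

# Sample data from the question, but with a hypothetical non-default index.
df = pd.DataFrame({'feature1': [100, 102, 241, 340, 98],
                   'feature2': [520, 504, 124, 830, 430],
                   'feature3': [54, 51, 4, 700, 123]},
                  index=['a', 'b', 'c', 'd', 'e'])

mask = (df['feature1'] < 100) | (df['feature2'] > 500)

# Index labels of the matching rows.
labels = df.index[mask].tolist()

# 0-based positions of the matching rows, independent of the index labels.
positions = np.flatnonzero(mask.to_numpy()).tolist()

print(labels)     # ['a', 'b', 'd', 'e']
print(positions)  # [0, 1, 3, 4]
```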
