Filter DataFrame rows based on another column using pandas

Time: 2020-04-18 13:26:39

Tags: python python-3.x pandas dataframe

I have a df containing the following values.


name     last_date                     submission_date

mike  2020-04-10 02:22:22.222   2020-04-01 02:22:22.222
mike  2020-04-10 02:22:22.222   2020-04-08 02:22:22.222
mike  2020-04-10 02:22:22.222   2020-04-16 02:22:22.222

ross  2020-04-16 02:22:22.222   2020-04-18 02:22:22.222
ross  2020-04-16 02:22:22.222   2020-04-19 02:22:22.222
ross  2020-04-16 02:22:22.222   2020-04-20 02:22:22.222
ross  2020-04-16 02:22:22.222   2020-04-15 02:22:22.222

carter 2020-04-22 02:22:22.222   2020-04-28 02:22:22.222
carter 2020-04-22 02:22:22.222   2020-04-15 02:22:22.222
carter 2020-04-22 02:22:22.222   2020-04-19 02:22:22.222
carter 2020-04-22 02:22:22.222   2020-04-21 02:22:22.222



I want to filter the rows based on last_date: if submission_date is greater than last_date, the row should be excluded.

Expected output:

name     last_date                     submission_date

mike  2020-04-10 02:22:22.222   2020-04-01 02:22:22.222
mike  2020-04-10 02:22:22.222   2020-04-08 02:22:22.222

ross  2020-04-16 02:22:22.222   2020-04-15 02:22:22.222

carter 2020-04-22 02:22:22.222   2020-04-15 02:22:22.222
carter 2020-04-22 02:22:22.222   2020-04-19 02:22:22.222
carter 2020-04-22 02:22:22.222   2020-04-21 02:22:22.222




1 Answer:

Answer 0: (score: 1)

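You can query the DataFrame for rows where submission_date is less than or equal to last_date. This returns the rows that satisfy the condition and filters out the rest: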

df.query("last_date>=submission_date")

    name                 last_date         submission_date
0   mike   2020-04-10 02:22:22.222 2020-04-01 02:22:22.222
1   mike   2020-04-10 02:22:22.222 2020-04-08 02:22:22.222
2   ross   2020-04-16 02:22:22.222 2020-04-15 02:22:22.222
3  carter  2020-04-22 02:22:22.222 2020-04-15 02:22:22.222
4  carter  2020-04-22 02:22:22.222 2020-04-19 02:22:22.222
5  carter  2020-04-22 02:22:22.222 2020-04-21 02:22:22.222
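For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample data from the question and applies the same filter. It assumes the two date columns may arrive as strings, so it converts them with pd.to_datetime first; the variable names (rows, filtered, masked, result) are only illustrative. It also shows the equivalent boolean-mask form. Note that query keeps the original index labels of the surviving rows, so a listing numbered 0 to 5 corresponds to calling reset_index(drop=True) afterwards.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    ("mike", "2020-04-10 02:22:22.222", "2020-04-01 02:22:22.222"),
    ("mike", "2020-04-10 02:22:22.222", "2020-04-08 02:22:22.222"),
    ("mike", "2020-04-10 02:22:22.222", "2020-04-16 02:22:22.222"),
    ("ross", "2020-04-16 02:22:22.222", "2020-04-18 02:22:22.222"),
    ("ross", "2020-04-16 02:22:22.222", "2020-04-19 02:22:22.222"),
    ("ross", "2020-04-16 02:22:22.222", "2020-04-20 02:22:22.222"),
    ("ross", "2020-04-16 02:22:22.222", "2020-04-15 02:22:22.222"),
    ("carter", "2020-04-22 02:22:22.222", "2020-04-28 02:22:22.222"),
    ("carter", "2020-04-22 02:22:22.222", "2020-04-15 02:22:22.222"),
    ("carter", "2020-04-22 02:22:22.222", "2020-04-19 02:22:22.222"),
    ("carter", "2020-04-22 02:22:22.222", "2020-04-21 02:22:22.222"),
]
df = pd.DataFrame(rows, columns=["name", "last_date", "submission_date"])

# Assumption: the columns are strings; convert them so the comparison
# is done on timestamps rather than on text.
for col in ("last_date", "submission_date"):
    df[col] = pd.to_datetime(df[col])

# The approach from the answer: keep rows where the condition holds.
filtered = df.query("last_date >= submission_date")

# Equivalent boolean-mask form of the same condition.
masked = df[df["submission_date"] <= df["last_date"]]

# Optional: renumber the surviving rows from 0.
result = filtered.reset_index(drop=True)
print(result)
```

Both forms select the same rows; query is often shorter to read, while the boolean mask avoids parsing an expression string and works with column names that are not valid Python identifiers.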