I have a pd.DataFrame like the one below:
key_value date
value_01 2017-01-13
value_01 2018-02-17
value_01 2018-04-02
value_01 2018-05-13
value_01 2018-05-16
value_02 2017-01-18
value_02 2018-03-13
value_02 2018-04-01
value_02 2018-05-16
value_02 2018-05-22
value_03 2018-01-13
value_03 2018-04-14
Now, based on key_value, I want to drop all rows whose date column value is before 2018-04-01.
I want the final output to look like this:
key_value date
value_01 2018-04-02
value_01 2018-05-13
value_01 2018-05-16
value_02 2018-04-01
value_02 2018-05-16
value_02 2018-05-22
value_03 2018-04-14
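Answer 0 (score: 2)
You can simply use Boolean indexing to filter the dataframe. No grouping operation is needed here. Just remember to convert the series to datetime first.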
df['date'] = pd.to_datetime(df['date'])
res = df[~(df['date'] < '2018-04-01')]
print(res)
key_value date
2 value_01 2018-04-02
3 value_01 2018-05-13
4 value_01 2018-05-16
7 value_02 2018-04-01
8 value_02 2018-05-16
9 value_02 2018-05-22
11 value_03 2018-04-14
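For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the question's data and applies the same Boolean-indexing idea. The names cutoff and res are just illustrative, and the mask is written as >= instead of ~(<). The two forms give the same rows for this data, but they are not identical in general: if the column contained NaT values, ~(df['date'] < cutoff) would keep those rows while df['date'] >= cutoff would drop them.

```python
import pandas as pd

# Rebuild the sample data from the question
df = pd.DataFrame({
    "key_value": ["value_01"] * 5 + ["value_02"] * 5 + ["value_03"] * 2,
    "date": [
        "2017-01-13", "2018-02-17", "2018-04-02", "2018-05-13", "2018-05-16",
        "2017-01-18", "2018-03-13", "2018-04-01", "2018-05-16", "2018-05-22",
        "2018-01-13", "2018-04-14",
    ],
})

# Convert the string column to datetime before comparing
df["date"] = pd.to_datetime(df["date"])

# Keep rows on or after the cutoff (the cutoff date itself is kept)
cutoff = pd.Timestamp("2018-04-01")
res = df[df["date"] >= cutoff]
print(res)
```

The original index labels are preserved by the filter; calling res.reset_index(drop=True) afterwards would renumber the rows from 0 if that is preferred.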
Answer 1 (score: 0)
This code may not be the best, but it does what you ask even if your dates are not sorted.
import pandas as pd
from datetime import datetime

# Raw data as a dict of lists; it is turned into a dataframe at the end
d = {'key_value': [1, 2, 3, 4, 5], 'date': ['2017-01-13', '2018-02-17', '2018-04-02', '2018-05-13', '2018-05-16']}
date_string = '2018-04-01'  # cutoff date
date_to_drop = datetime.strptime(date_string, '%Y-%m-%d')  # convert the cutoff to datetime
i = 0
l = len(d['date'])  # number of dates
while i < l:  # loop over the dates
    datetime_object = datetime.strptime(d['date'][i], '%Y-%m-%d')  # convert the current date to datetime
    if datetime_object < date_to_drop:  # the current date is before the cutoff, so remove the entry
        d['date'].pop(i)  # delete the date
        d['key_value'].pop(i)  # delete the matching key_value
        l -= 1  # the lists shrank by one element, so stay at the same index
    else:  # the current date is on or after the cutoff, so move to the next entry
        i += 1
df = pd.DataFrame(data=d)
print(df)
This is the result:
date key_value
0 2018-04-02 3
1 2018-05-13 4
2 2018-05-16 5
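As a rough alternative to popping items inside a while loop, the same filtering can be sketched by pairing each key with its date and keeping only the pairs that pass the test. This is not the answer's code: the input order below is deliberately shuffled (an assumption made here to illustrate the "unsorted dates" case), and the names cutoff, kept, and filtered are hypothetical.

```python
from datetime import datetime

# Hypothetical unsorted input: same values as above, shuffled on purpose
d = {
    "key_value": [1, 2, 3, 4, 5],
    "date": ["2018-05-13", "2017-01-13", "2018-05-16", "2018-02-17", "2018-04-02"],
}
cutoff = datetime.strptime("2018-04-01", "%Y-%m-%d")

# Pair each key with its date and keep the pairs on or after the cutoff
kept = [
    (key, date)
    for key, date in zip(d["key_value"], d["date"])
    if datetime.strptime(date, "%Y-%m-%d") >= cutoff
]

# Split the surviving pairs back into two parallel lists
filtered = {
    "key_value": [key for key, _ in kept],
    "date": [date for _, date in kept],
}
print(filtered)
```

Because every element is tested independently, the order of the input does not matter, and the original dict d is left unmodified.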
Answer 2 (score: 0)
A bit late, but here is my solution. I tried to use some plain pythonic code without pandas. It may be easier to read.
from datetime import datetime

data = {}
specificDate = datetime.strptime("2018-04-01", "%Y-%m-%d")
data.update({"value_01": ["2017-01-13", "2018-02-17", "2018-04-02", "2018-05-13", "2018-05-16"]})
data.update({"value_02": ["2017-01-18", "2018-03-13", "2018-04-01", "2018-05-16", "2018-05-22"]})
data.update({"value_03": ["2018-01-13", "2018-04-14"]})
# For each key, keep only the dates on or after specificDate
for key in data.keys():
    data.update({key: list(filter(lambda x: datetime.strptime(x, "%Y-%m-%d") >= specificDate, data[key]))})
# Print each key followed by its remaining dates
for key, value in data.items():
    print(key)
    for val in value:
        print(" " + val)
Output:
value_01
 2018-04-02
 2018-05-13
 2018-05-16
value_02
 2018-04-01
 2018-05-16
 2018-05-22
value_03
 2018-04-14
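A more compact variant of the same idea, offered as a sketch and not as a replacement for the code above: a dict comprehension builds the filtered mapping in one step. It also leans on one assumption that should be checked against real data: every date string is zero-padded in YYYY-MM-DD form. Under that assumption, plain string comparison orders the values the same way as chronological comparison, so no parsing is needed. The names cutoff and filtered are illustrative.

```python
# Same data as in the answer above
data = {
    "value_01": ["2017-01-13", "2018-02-17", "2018-04-02", "2018-05-13", "2018-05-16"],
    "value_02": ["2017-01-18", "2018-03-13", "2018-04-01", "2018-05-16", "2018-05-22"],
    "value_03": ["2018-01-13", "2018-04-14"],
}

# Assumption: all dates are zero-padded YYYY-MM-DD strings,
# so lexicographic order matches chronological order
cutoff = "2018-04-01"

# Build a new dict holding only the dates on or after the cutoff
filtered = {
    key: [date for date in dates if date >= cutoff]
    for key, dates in data.items()
}

for key, dates in filtered.items():
    print(key)
    for date in dates:
        print(" " + date)
```

If the strings might come in other formats (for example 2018-4-1), parsing them with datetime.strptime as in the answer is the safer choice.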