I have a list of date ranges and a DataFrame, as shown below: [('2019-01-01','2019-01-04'), ('2019-12-25','2019-12-28'), ('2019-18-29','2019-12-21'),]
+------------+----+
| date       | id |
+------------+----+
| 2018-01-04 | 1  |
| 2018-01-02 | 1  |
| 2018-01-01 | 1  |
| 2017-12-28 | 1  |
| 2017-12-27 | 1  |
| 2017-12-26 | 1  |
| 2017-12-25 | 1  |
| 2017-12-21 | 1  |
| 2017-12-20 | 1  |
| 2017-12-18 | 1  |
+------------+----+
Expected output:
+------------+----+-------+
| date       | id | group |
+------------+----+-------+
| 2018-01-04 | 1  | 1     |
| 2018-01-02 | 1  | 1     |
| 2018-01-01 | 1  | 1     |
| 2017-12-28 | 1  | 2     |
| 2017-12-27 | 1  | 2     |
| 2017-12-26 | 1  | 2     |
| 2017-12-25 | 1  | 2     |
| 2017-12-21 | 1  | 3     |
| 2017-12-20 | 1  | 3     |
| 2017-12-18 | 1  | 3     |
+------------+----+-------+
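I tried using a list comprehension to assign 1 to rows where date <= "2019-01-04" and date >= "2019-01-01", and so on for the other ranges, but it doesn't work. Can anyone help me?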
Answer 0 (score: 1)
This should do it:
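import pandas as pd

# Parse the strings into Timestamps so the comparisons are chronological.
df['date'] = pd.to_datetime(df['date'])

def f(x):
    # Ranges are inclusive on both ends; a date outside every range returns None.
    if pd.Timestamp('2018-01-01') <= x <= pd.Timestamp('2018-01-04'):
        return 1
    elif pd.Timestamp('2017-12-25') <= x <= pd.Timestamp('2017-12-28'):
        return 2
    # Upper bound is 2017-12-21 so that date falls in group 3, as in the expected output.
    elif pd.Timestamp('2017-12-18') <= x <= pd.Timestamp('2017-12-21'):
        return 3

df['group'] = df['date'].apply(f)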
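If the ranges already live in a list, as in the question, the if/elif chain can be driven by that list instead of being hard-coded. Here is a minimal sketch, assuming the ranges are inclusive and follow the expected output (2018-01-01 to 2018-01-04, and so on) rather than the 2019 dates in the question's list. The names `ranges` and `group` are only illustrative:

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "date": ["2018-01-04", "2018-01-02", "2018-01-01", "2017-12-28",
             "2017-12-27", "2017-12-26", "2017-12-25", "2017-12-21",
             "2017-12-20", "2017-12-18"],
    "id": 1,
})
df["date"] = pd.to_datetime(df["date"])

# Assumed inclusive (start, end) pairs, read off the expected output.
ranges = [
    ("2018-01-01", "2018-01-04"),
    ("2017-12-25", "2017-12-28"),
    ("2017-12-18", "2017-12-21"),
]

# Start with a nullable integer column so unmatched dates stay missing.
group = pd.Series(pd.NA, index=df.index, dtype="Int64")
for number, (start, end) in enumerate(ranges, start=1):
    # between() is inclusive on both ends by default.
    mask = df["date"].between(pd.Timestamp(start), pd.Timestamp(end))
    group[mask] = number

df["group"] = group
print(df)
```

Each pass of the loop is one vectorized comparison over the whole column, so the cost grows with the number of ranges rather than with a Python call per row.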
Edit:
Alternatively, you can do the following:
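# Each range must run from the earlier date to the later one; date_range is empty if start is after end.
date_ranges = [pd.date_range(start='2018-01-01', end='2018-01-04'),
               pd.date_range(start='2017-12-25', end='2017-12-28'),
               pd.date_range(start='2017-12-18', end='2017-12-21'),
               ]
# enumerate(..., start=1) numbers the groups 1, 2, 3; next(..., None) avoids an IndexError for unmatched dates.
df['group'] = df['date'].apply(lambda x: next((i for i, date_rng in enumerate(date_ranges, start=1) if x in date_rng), None))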
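For larger frames, a lookup against a `pd.IntervalIndex` avoids calling a Python function for every row. The sketch below is one possible way to do it, again assuming inclusive, non-overlapping ranges taken from the expected output. The variable names are illustrative:

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2018-01-04", "2018-01-02", "2018-01-01", "2017-12-28",
        "2017-12-27", "2017-12-26", "2017-12-25", "2017-12-21",
        "2017-12-20", "2017-12-18",
    ]),
    "id": 1,
})

# Assumed range bounds, read off the expected output.
starts = pd.to_datetime(["2018-01-01", "2017-12-25", "2017-12-18"])
ends = pd.to_datetime(["2018-01-04", "2017-12-28", "2017-12-21"])

# closed="both" makes each interval include its start and end date.
intervals = pd.IntervalIndex.from_arrays(starts, ends, closed="both")

# get_indexer gives the position of the interval containing each date,
# or -1 when a date falls in none of them.
positions = intervals.get_indexer(df["date"])

# Shift to 1-based group numbers; unmatched dates would become 0.
df["group"] = positions + 1
print(df)
```

`get_indexer` requires the intervals not to overlap. Dates outside every interval end up in group 0 here, which can be replaced with a missing value if that reads better.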