I have a DataFrame containing the following election data:
Date Winner
0 1910-04-13 ALP
1 1913-05-31 L+NP
2 1914-09-05 ALP
3 1917-05-05 L+NP
4 1919-12-13 L+NP
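How can I work out which party was in office on a given date that is not explicitly listed in the dataset?
For example, when I try the following code, I get an empty Series: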
df['Winner'][df['Date'].dt.year == 1916]
How can I get the most recent election result before that date, which in this case should be ALP?
Answer 0 (score: 0)
One way to handle this is to add a new column holding the end of each term and compare against it:
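import pandas as pd
# Parse the dates so they can be compared
df['Date'] = pd.to_datetime(df['Date'])
# Each term ends the day before the next election
df['End'] = df['Date'].shift(-1) - pd.Timedelta(days=1)
# The last term is still open, so end it at the current time
df['End'] = df['End'].fillna(pd.Timestamp.now())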
The new df:
Date Winner End
0 1910-04-13 ALP 1913-05-30 00:00:00.000000
1 1913-05-31 L+NP 1914-09-04 00:00:00.000000
2 1914-09-05 ALP 1917-05-04 00:00:00.000000
3 1917-05-05 L+NP 1919-12-12 00:00:00.000000
4 1919-12-13 L+NP 2019-06-09 15:43:14.319334
Then use a logical comparison:
q = pd.Timestamp(1916, 10, 1)
# Inclusive bounds so that an election day or a term's last day also matches
df.loc[(df['Date'] <= q) & (q <= df['End']), 'Winner']
Output:
2 ALP
Name: Winner, dtype: object
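If you would rather not add an End column, a rough alternative is to look up the latest election held on or before the query date with a sorted search. This is only a sketch: winner_on is a hypothetical helper name, the frame is rebuilt here from the data in the question, and it assumes the Date column is sorted in ascending order.

```python
import pandas as pd

# Election data copied from the question
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["1910-04-13", "1913-05-31", "1914-09-05", "1917-05-05", "1919-12-13"]
    ),
    "Winner": ["ALP", "L+NP", "ALP", "L+NP", "L+NP"],
})


def winner_on(frame, when):
    """Hypothetical helper: winner of the latest election on or before `when`.

    Assumes frame["Date"] is datetime64 and sorted ascending.
    Returns None if `when` is earlier than the first election.
    """
    when = pd.Timestamp(when)
    # side="right" gives the number of election dates <= when
    pos = frame["Date"].searchsorted(when, side="right") - 1
    if pos < 0:
        return None
    return frame["Winner"].iloc[pos]


print(winner_on(df, "1916-10-01"))
```

For 1916-10-01 this picks the 1914-09-05 row, the same row the comparison above selects. A query that falls exactly on an election day returns that election's winner.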