I have 2 DataFrames:
df_a
datetime var
2016-10-15 110.232790
2016-10-16 111.020661
2016-10-17 112.193496
2016-10-18 113.638143
2016-10-19 115.241448
2017-01-01 113.638143
2017-01-02 115.241448
and df_b
datetime var
2000-01-01 165.792185
2000-01-02 166.066959
2000-01-03 166.411669
2000-01-04 167.816046
2000-01-05 169.777814
2000-10-15 114.232790
2000-10-16 113.020661
2001-01-01 164.792185
2001-01-02 161.066959
2001-01-03 156.411669
2002-01-04 167.816046
2002-01-05 169.777814
2002-10-15 174.232790
2003-10-16 114.020661
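df_a contains data for 2016 and 2017, while df_b covers 2000 to 2015 (the years do not overlap).
Can I arrange each group in df_b so that its rows follow the same date order as df_a? A group is defined as the rows that share the same year, e.g. 2000.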
Answer 0 (score: 1)
You can chain an additional condition to check the year:
df = df_b[df_b.index.month.isin(df_a.index.month) &
          df_b.index.day.isin(df_a.index.day) &
          (df_b.index.year == 2000)]
print (df)
var
datetime
2000-01-01 165.792185
2000-01-02 166.066959
2000-10-15 114.232790
2000-10-16 113.020661
Edit: to keep the matching rows from every year, drop the year condition:
df = df_b[df_b.index.month.isin(df_a.index.month) & df_b.index.day.isin(df_a.index.day)]
print (df)
var
datetime
2000-01-01 165.792185
2000-01-02 166.066959
2000-10-15 114.232790
2000-10-16 113.020661
2001-01-01 164.792185
2001-01-02 161.066959
2002-10-15 174.232790
2003-10-16 114.020661
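One caveat with the filter above: month and day are tested independently, so a date such as January 15 would pass whenever 1 occurs as a month and 15 occurs as a day somewhere in df_a, even if df_a has no 01-15 row. With the sample data in the question this makes no difference, but a joint check on '%m-%d' strings is stricter. A minimal sketch follows; the small frames and the value 150.0 are made up purely to show the effect and are not from the question.

```python
import pandas as pd

# Hypothetical miniature frames (not the question's data) to expose the difference
df_a = pd.DataFrame(
    {"var": [110.232790, 111.020661, 113.638143]},
    index=pd.to_datetime(["2016-10-15", "2016-10-16", "2017-01-01"]),
)
df_b = pd.DataFrame(
    {"var": [165.792185, 150.0, 114.232790]},  # 150.0 is an invented value
    index=pd.to_datetime(["2000-01-01", "2000-01-15", "2000-10-15"]),
)

# Independent checks: 2000-01-15 passes because month 1 and day 15 each occur in df_a
loose = df_b[df_b.index.month.isin(df_a.index.month)
             & df_b.index.day.isin(df_a.index.day)]

# Joint check: keep a row only if its exact month-day pair occurs in df_a
strict = df_b[df_b.index.strftime("%m-%d").isin(df_a.index.strftime("%m-%d"))]

print(loose)
print(strict)
```

Here loose keeps all three rows, while strict drops 2000-01-15.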
#create a dictionary that maps each month-day in df_a to its position, using factorize
a = pd.factorize(df_a.index.strftime('%m-%d'))
d = dict(zip(a[1], a[0]))
print (d)
{'01-02': 6, '10-19': 4, '10-18': 3, '10-15': 0, '01-01': 5, '10-16': 1, '10-17': 2}
#ordering Series: multiply the year by 1000 because the month-day position (at most 366 distinct MM-DD values) needs up to 3 digits
order = pd.Series(df.index.strftime('%m-%d'), index=df.index).map(d) + df.index.year * 1000
print (order)
datetime
2000-01-01 2000005
2000-01-02 2000006
2000-10-15 2000000
2000-10-16 2000001
2001-01-01 2001005
2001-01-02 2001006
2002-10-15 2002000
2003-10-16 2003001
Name: datetime, dtype: int64
Finally, sort order and reindex df by the sorted index:
df = df.reindex(order.sort_values().index)
print (df)
var
datetime
2000-10-15 114.232790
2000-10-16 113.020661
2000-01-01 165.792185
2000-01-02 166.066959
2001-01-01 164.792185
2001-01-02 161.066959
2002-10-15 174.232790
2003-10-16 114.020661
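For convenience, the steps above can be collected into one self-contained script. This is only a sketch under a few assumptions: the helper name reorder_like is hypothetical, the frames are rebuilt from the sample data in the question, the filter matches month-day pairs jointly, and the rank dictionary is built with enumerate over the unique month-day values so that it also works if df_a repeats a month-day across years.

```python
import numpy as np
import pandas as pd

def make_frame(dates, values):
    """Build a frame with a DatetimeIndex named 'datetime' and a single 'var' column."""
    return pd.DataFrame({"var": values},
                        index=pd.DatetimeIndex(dates, name="datetime"))

df_a = make_frame(
    ["2016-10-15", "2016-10-16", "2016-10-17", "2016-10-18",
     "2016-10-19", "2017-01-01", "2017-01-02"],
    [110.232790, 111.020661, 112.193496, 113.638143,
     115.241448, 113.638143, 115.241448],
)
df_b = make_frame(
    ["2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04", "2000-01-05",
     "2000-10-15", "2000-10-16", "2001-01-01", "2001-01-02", "2001-01-03",
     "2002-01-04", "2002-01-05", "2002-10-15", "2003-10-16"],
    [165.792185, 166.066959, 166.411669, 167.816046, 169.777814,
     114.232790, 113.020661, 164.792185, 161.066959, 156.411669,
     167.816046, 169.777814, 174.232790, 114.020661],
)

def reorder_like(target, reference):
    """Hypothetical helper: keep rows of target whose month-day occurs in
    reference, then order them by year and, within each year, by the position
    of that month-day in reference."""
    # Unique month-day strings of reference, in order of first appearance
    _, uniques = pd.factorize(reference.index.strftime("%m-%d"))
    rank = {md: pos for pos, md in enumerate(uniques)}

    kept = target[target.index.strftime("%m-%d").isin(uniques)]
    pos = np.array([rank[md] for md in kept.index.strftime("%m-%d")])

    # The year dominates the key; pos is below 1000, so keys never collide across years
    key = kept.index.year.to_numpy() * 1000 + pos
    return kept.iloc[np.argsort(key, kind="stable")]

result = reorder_like(df_b, df_a)
print(result)
```

On the sample data this should give the same row order as the reindex output above: within 2000 the October rows come before the January rows, as they do in df_a.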