I have the DataFrames below and am trying to join them on "File_date" and "Symbol", which are common to both.
Wanted:
>>> df_clean.head(100)
File_date Symbol hv20 hv50 hv100 Date curiv Days Percentile Close Changed
4609 20180423 ZYNE 68 64 64.0 180423.0 65.86 430.0 11.0 10.36 1
>>> df_clean.index
Int64Index([4609, 4611, 4608, 4606, 4603, 4600, 4609, 4607, 4604, 4604,
...
0, 0, 0, 0, 0, 0, 0, 0, 0, 4617], dtype='int64', length=419721)
>>> df_allhv_to_date.head(100)
hv5 hv10 dj20 Symbol
20180423 24.18 22.50 30.01 ZYNE
>>> df_allhv_to_date.index
Int64Index([20171219, 20171220, 20171221, 20171222, 20171226, 20171227, 20171228, 20171229, 20180102, 20180103,
...
20180410, 20180411, 20180412, 20180413, 20180416, 20180417, 20180418, 20180419, 20180420, 20180423], dtype='int64', length=425)
I tried:
File_date Symbol hv5 hv10 dj20 hv20 hv50 hv100 Date curiv Days Percentile Close Changed
20180413 ZYNE 23.04 34.22 30.61 73 67 65.0 180413.0 79.87 424.0 48.0 10.17 0
But it is not working. What am I missing?
Answer 0 (score: 0)
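Your problem is that the date data in 'df_allhv_to_date' lives in the index, not in a column, so there is no 'File_date' column to merge on. I think we should first name the index, then move it out of the index into a regular column, and join on those columns, like this: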
import pandas as pd

# Name the index 'File_date', move it into a column, then merge on both keys
df_for_sql = pd.merge(df_allhv_to_date.rename_axis('File_date', axis=0).reset_index(),
                      df_clean,
                      how='left',
                      on=['File_date', 'Symbol'])
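
To see the effect without the full data, here is a minimal sketch on two tiny frames. The frames are hypothetical cut-down versions of the ones in the question (only a few columns, two rows each, values taken from the rows shown above), and the names `left` and `merged` are mine, not from the original post.

```python
import pandas as pd

# Hypothetical miniature of df_clean: File_date and Symbol are ordinary columns.
df_clean = pd.DataFrame(
    {
        "File_date": [20180423, 20180413],
        "Symbol": ["ZYNE", "ZYNE"],
        "hv20": [68, 73],
        "Close": [10.36, 10.17],
    },
    index=[4609, 4611],
)

# Hypothetical miniature of df_allhv_to_date: the dates sit in an unnamed index.
df_allhv_to_date = pd.DataFrame(
    {
        "hv5": [23.04, 24.18],
        "hv10": [34.22, 22.50],
        "Symbol": ["ZYNE", "ZYNE"],
    },
    index=[20180413, 20180423],
)

# Name the index, turn it into a regular column, then merge on both keys.
left = df_allhv_to_date.rename_axis("File_date").reset_index()
merged = pd.merge(left, df_clean, how="left", on=["File_date", "Symbol"])

print(merged)
```

With how='left', every row of the date-indexed frame is kept, and rows with no match in df_clean would get NaN in the df_clean columns. If the real merge still comes back empty, it may be worth checking that 'File_date' has the same dtype on both sides.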