pandas join using multiple columns

Time: 2018-04-25 20:06:50

Tags: pandas

I have the DataFrames below and am trying to join them on "File_date" and "Symbol", which are common to both.


The two DataFrames look like this:

>>> df_clean.head(100)
      File_date  Symbol  hv20  hv50  hv100      Date  curiv   Days  Percentile  Close  Changed
4609   20180423    ZYNE    68    64   64.0  180423.0  65.86  430.0        11.0  10.36        1

>>> df_clean.index
Int64Index([4609, 4611, 4608, 4606, 4603, 4600, 4609, 4607, 4604, 4604,
            ...
               0,    0,    0,    0,    0,    0,    0,    0,    0, 4617], dtype='int64', length=419721)

>>> df_allhv_to_date.head(100)
            hv5   hv10   dj20  Symbol
20180423  24.18  22.50  30.01    ZYNE

>>> df_allhv_to_date.index
Int64Index([20171219, 20171220, 20171221, 20171222, 20171226, 20171227, 20171228, 20171229, 20180102, 20180103,
            ...
            20180410, 20180411, 20180412, 20180413, 20180416, 20180417, 20180418, 20180419, 20180420, 20180423], dtype='int64', length=425)

What I want:

File_date  Symbol    hv5   hv10   dj20  hv20  hv50  hv100      Date  curiv   Days  Percentile  Close  Changed
 20180413    ZYNE  23.04  34.22  30.61    73    67   65.0  180413.0  79.87  424.0        48.0  10.17        0

I tried to join them, but it did not work. What am I missing?

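1 answer:

Answer 0: (score: 0)

Your problem is that the date data in 'df_allhv_to_date' is in the index, not in a column. So I think we should first name the index 'File_date', then move it out of the index into a regular column with reset_index(), and merge on both columns, like this: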

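# Name the date index 'File_date' and turn it into an ordinary column,
# then left-merge df_clean onto it using both key columns.
df_for_sql = pd.merge(df_allhv_to_date.rename_axis('File_date', axis=0).reset_index(),
                      df_clean,
                      how='left',
                      on=['File_date', 'Symbol'])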
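
To see the idea end to end, here is a minimal sketch on toy data. The two small frames are stand-ins for the ones in the question: the values are copied from the rows shown there, only a few columns are kept, and the second index label in df_clean (4611) is paired with that row purely for illustration.

```python
import pandas as pd

# Toy stand-in for df_clean: 'File_date' is an ordinary column here.
# Only a few of the question's columns are kept; index labels are illustrative.
df_clean = pd.DataFrame(
    {
        "File_date": [20180423, 20180413],
        "Symbol": ["ZYNE", "ZYNE"],
        "hv20": [68, 73],
        "hv50": [64, 67],
    },
    index=[4609, 4611],
)

# Toy stand-in for df_allhv_to_date: the date lives in the (unnamed) index.
df_allhv_to_date = pd.DataFrame(
    {
        "hv5": [24.18, 23.04],
        "hv10": [22.50, 34.22],
        "dj20": [30.01, 30.61],
        "Symbol": ["ZYNE", "ZYNE"],
    },
    index=[20180423, 20180413],
)

# Name the index, move it into a column, then merge on both keys.
left = df_allhv_to_date.rename_axis("File_date", axis=0).reset_index()
merged = pd.merge(left, df_clean, how="left", on=["File_date", "Symbol"])

print(merged)
```

The merged frame has one row per row of df_allhv_to_date, with the hv20 and hv50 values from df_clean attached wherever both 'File_date' and 'Symbol' match. Because how='left' is used, any row without a match in df_clean would be kept with NaN in those columns.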