Merging/appending DataFrames results in new columns

Time: 2016-08-10 12:00:49

Tags: python pandas join merge append

I have the following merge operation:

data_static = pandas.merge(data_static, data_output[['TICKER', 'DATE', 'rolling_vola_40', 'rolling_vola_80', 'f_rolling_vola_40', 'f_rolling_vola_80', 'rolling_vola_prev_annum', 'rolling_vola_post_annum']], how='left', on=['TICKER', 'DATE'])

My problem is that this produces the following header:

;YEAR;DATE;TICKER;LONG_COMP_NAME;ISSUER_INDUSTRY;INDUSTRY_SECTOR;COUNTRY;ACCOUNTING_STANDARD;ACCOUNTING_STANDARD_OVERRIDE;EQY_FUND_CRNCY;INDEX;DATE_PREV;DATE_NEXT;rolling_vola_40_x;rolling_vola_80_x;f_rolling_vola_40_x;f_rolling_vola_80_x;rolling_vola_prev_annum_x;rolling_vola_post_annum_x;rolling_vola_40_y;rolling_vola_80_y;f_rolling_vola_40_y;f_rolling_vola_80_y;rolling_vola_prev_annum_y;rolling_vola_post_annum_y

I want the data to go into the same columns, like this:

;YEAR;DATE;TICKER;LONG_COMP_NAME;ISSUER_INDUSTRY;INDUSTRY_SECTOR;COUNTRY;ACCOUNTING_STANDARD;ACCOUNTING_STANDARD_OVERRIDE;EQY_FUND_CRNCY;INDEX;DATE_PREV;DATE_NEXT;rolling_vola_40;rolling_vola_80;f_rolling_vola_40;f_rolling_vola_80;rolling_vola_prev_annum;rolling_vola_post_annum;

So instead of the values sitting next to each other like this (example):

    TICKER   Val1_x   Val2_x   Val3_x   Val1_y   Val2_y   Val3_y
    A        80       6        1        NaN      NaN      NaN
    B        NaN      NaN      NaN      10       12       14

I want them to look like this:

    TICKER   Val1     Val2     Val3
    A        80       6        1
    B        10       12       14

My merge joins on the TICKER and DATE columns, so don't be confused by the sample data.
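For a frame that already has the `_x`/`_y` pairs, one possible repair is to fold each pair back into a single column. The following is a minimal sketch on the toy example above, not the asker's real data; the helper name `coalesce_suffixes` is made up for illustration, and it assumes that at most one side of each pair holds a value per row.

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the example above (values are illustrative only).
wide = pd.DataFrame({
    "TICKER": ["A", "B"],
    "Val1_x": [80, np.nan], "Val2_x": [6, np.nan], "Val3_x": [1, np.nan],
    "Val1_y": [np.nan, 10], "Val2_y": [np.nan, 12], "Val3_y": [np.nan, 14],
})


def coalesce_suffixes(df, names, left="_x", right="_y"):
    """Hypothetical helper: fold each name+left / name+right pair into one column."""
    out = df.copy()
    for name in names:
        # Take the left value where present, otherwise fall back to the right one.
        out[name] = out[name + left].combine_first(out[name + right])
        out = out.drop(columns=[name + left, name + right])
    return out


tidy = coalesce_suffixes(wide, ["Val1", "Val2", "Val3"])
print(tidy)
```

If both sides of a pair can hold different values on the same row, `combine_first` silently keeps the left one, so this only fits data shaped like the example.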

1 Answer:

Answer 0: (score: 1)

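The workaround here is to collect the results with append first and merge only once at the end. Presumably the original merge ran repeatedly, so from the second pass on data_static already contained the rolling_vola columns and pandas disambiguated the clashing names with the _x and _y suffixes.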

data_store = pandas.DataFrame(columns=('TICKER', 'DATE', 'rolling_vola_40', 'rolling_vola_80', 'f_rolling_vola_40', 'f_rolling_vola_80', 'rolling_vola_prev_annum', 'rolling_vola_post_annum'))

for index, row in data_static.iterrows():
    data_output = vol(row['TICKER'], row['DATE'], row['DATE_PREV'], row['DATE_NEXT'])
    if data_output is not None:
        data_store = data_store.append(data_output[['TICKER', 'DATE', 'rolling_vola_40', 'rolling_vola_80', 'f_rolling_vola_40', 'f_rolling_vola_80', 'rolling_vola_prev_annum', 'rolling_vola_post_annum']])

data_static = pandas.merge(data_static, data_store[['TICKER', 'DATE', 'rolling_vola_40', 'rolling_vola_80', 'f_rolling_vola_40', 'f_rolling_vola_80', 'rolling_vola_prev_annum', 'rolling_vola_post_annum']], how='left', on=['TICKER', 'DATE'])
data_static.to_csv('test.csv', sep=';', encoding='utf-8')
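
Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above fails on current versions. A rough equivalent, sketched under assumptions, collects the pieces in a list and stacks them with `pandas.concat`. The `vol` function below is a hypothetical stand-in (the real one is not shown in the thread), the column list is shortened, and the sample rows are invented.

```python
import pandas as pd

VOLA_COLS = ["TICKER", "DATE", "rolling_vola_40"]  # shortened for the sketch


def vol(ticker, date):
    """Hypothetical stand-in for the answer's vol(); returns None for one ticker."""
    if ticker == "B":
        return None
    return pd.DataFrame(
        {"TICKER": [ticker], "DATE": [date], "rolling_vola_40": [0.25]}
    )


# Invented sample rows, only to make the sketch runnable.
data_static = pd.DataFrame({
    "TICKER": ["A", "B", "C"],
    "DATE": ["2016-01-04", "2016-01-04", "2016-01-05"],
})

# Collect the per-row results in a list, skipping rows where vol() gave nothing.
pieces = []
for row in data_static.itertuples(index=False):
    data_output = vol(row.TICKER, row.DATE)
    if data_output is not None:
        pieces.append(data_output[VOLA_COLS])

# Stack everything once, then merge once, so no column name collides.
data_store = pd.concat(pieces, ignore_index=True)
merged = data_static.merge(data_store, how="left", on=["TICKER", "DATE"])
print(merged)
```

Because the merge happens a single time against a frame that does not yet have the volatility columns, no suffixes are added. `pandas.concat` raises a `ValueError` on an empty list, so real code would need to handle the case where every call returns None.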