Combining DataFrames

Time: 2016-11-07 21:29:54

Tags: python pandas dataframe merge multi-index

I am working with a large MultiIndex DataFrame that has several index levels, e.g. segment, period and classification, and several columns containing results, e.g. Results1 and Results2. The DataFrame consolidated_df should store all of my calculation results.

The structure (of the large DataFrame) is as follows:

import pandas as pd
import numpy as np

segments = ['A', 'B', 'C']
periods = [1, 2]
classification = ['x', 'y']

index_constr = pd.MultiIndex.from_product(
    [segments, periods, classification],
    names=['Segment', 'Period', 'Classification'])

consolidated_df = pd.DataFrame(np.nan, index=index_constr,
                                       columns=['Results1', 'Results2'])

print(consolidated_df)

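                               Results1  Results2
Segment Period Classification                    
A       1      x                    NaN       NaN
               y                    NaN       NaN
        2      x                    NaN       NaN
               y                    NaN       NaN
B       1      x                    NaN       NaN
               y                    NaN       NaN
        2      x                    NaN       NaN
               y                    NaN       NaN
C       1      x                    NaN       NaN
               y                    NaN       NaN
        2      x                    NaN       NaN
               y                    NaN       NaN

I am running a for loop over all of my segments (A, B, C) to calculate the results (stored in the columns of the DataFrame) using a separate function, calc_function.

This function returns a DataFrame in exactly the same format as the consolidated DataFrame, except that it reports only one segment at a time (i.e. it is a slice of the consolidated DataFrame).

I tried to store the result DataFrame in the consolidated one, but without success. An example of what calc_function returns: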

index_result = pd.MultiIndex.from_product(
    [['A'], periods, classification],
    names=['Segment', 'Period', 'Classification'])

result_calc = pd.DataFrame(np.random.randn(4,2), index=index_result, 
     columns=['Results1', 'Results2'])

print(result_calc)

                               Results1  Results2
Segment Period Classification                    
A       1      x              -1.568351  0.386250
               y               0.679170  1.552551
        2      x              -1.190928 -0.765319
               y               3.254929  1.436295

Is there an easy way to integrate the smaller DataFrame into the consolidated DataFrame?

1 Answer:

Answer 0: (score: 1)

Using your example above, how about consolidated_df.ix['A'] = result_calc

(this is the same as consolidated_df.ix['A', :, :] = result_calc)

print(consolidated_df)

                               Results1  Results2
Segment Period Classification                    
A       1      x               1.290466  0.228978
               y              -0.276959  0.735192
        2      x               0.757339 -0.787502
               y              -0.609848  0.805773
B       1      x                    NaN       NaN
               y                    NaN       NaN
        2      x                    NaN       NaN
               y                    NaN       NaN
C       1      x                    NaN       NaN
               y                    NaN       NaN
        2      x                    NaN       NaN
               y                    NaN       NaN
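
For anyone on a recent pandas: .ix was deprecated and later removed (it is gone as of pandas 1.0), so the assignment above needs .loc instead. Below is a minimal sketch of the whole loop under that assumption. The body of calc_function here is a hypothetical stand-in that just fills in dummy numbers, since the real function is not shown in the question; the pd.concat line at the end is an alternative I am suggesting, not something from the answer.

```python
import numpy as np
import pandas as pd

segments = ['A', 'B', 'C']
periods = [1, 2]
classification = ['x', 'y']
names = ['Segment', 'Period', 'Classification']

# Empty consolidated frame, built the same way as in the question.
consolidated_df = pd.DataFrame(
    np.nan,
    index=pd.MultiIndex.from_product(
        [segments, periods, classification], names=names),
    columns=['Results1', 'Results2'])


def calc_function(segment):
    """Hypothetical stand-in for the real calculation.

    Returns a frame for a single segment, with the same index levels
    and columns as consolidated_df, filled with dummy values.
    """
    index = pd.MultiIndex.from_product(
        [[segment], periods, classification], names=names)
    offset = 10.0 * segments.index(segment)
    values = offset + np.arange(8, dtype=float).reshape(4, 2)
    return pd.DataFrame(values, index=index,
                        columns=['Results1', 'Results2'])


for segment in segments:
    # A partial key on the first index level selects that segment's rows;
    # .loc takes the place of the removed .ix indexer.
    consolidated_df.loc[segment] = calc_function(segment)

# Alternative: skip the pre-allocated frame and stack the pieces directly.
combined = pd.concat([calc_function(s) for s in segments])

print(consolidated_df)
```

If every segment is always computed, the pd.concat variant avoids having to pre-build an all-NaN frame; the .loc assignment is the better fit when only some segments are filled in at a time.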