Populating a DataFrame in pandas

Time: 2018-11-05 10:38:03

Tags: python pandas dataframe

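I am trying to populate a DataFrame in pandas. I tried creating a dictionary and then passing it to a DataFrame, but that did not give me the result I wanted.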

Here is my current code:

holidays_dic= {
    'Half_Summer17'   :{'26-05-2017':'01-06-2017'}
    ,'Summer17'       :{'21-07-2017':'31-08-2017'}
    ,'Half_Fall17'    :{'20-10-2017':'26-10-2017'}
    ,'Xmas17'         :{'20-12-2017':'02-01-2018'}
    ,'Half_Spring18'  :{'12-02-2018':'16-02-2018'}
    ,'Easter18'       :{'30-03-2018':'13-04-2018'}
    ,'Half_Summer18'  :{'28-05-2018':'01-06-2018'}
    ,'Summer18'       :{'25-07-2018':'04-09-2018'}
    ,'Half_Fall18'    :{'22-10-2018':'25-10-2018'}
    ,'Xmas18'         :{'20-12-2018':'03-01-2018'}
 #   ,'Half_Spring19'  :{'01-01-2017':'01-01-2017'}
 #   ,'Easter19'       :{'01-01-2017':'01-01-2017'}
}

df_holidays=pd.DataFrame(holidays_dic,)

#holidays_dic
df_holidays

The output I want looks like this:

index           sDate     eDate
Half_Summer17   26-05-17  01-06-17
Summer17        21-07-17  31-08-17
etc.

Does anyone have any ideas?

3 answers:

Answer 0 (score: 3)

You can do the following:

import pandas as pd

holidays_dic = {
    'Half_Summer17': {'26-05-2017': '01-06-2017'}
    , 'Summer17': {'21-07-2017': '31-08-2017'}
    , 'Half_Fall17': {'20-10-2017': '26-10-2017'}
    , 'Xmas17': {'20-12-2017': '02-01-2018'}
    , 'Half_Spring18': {'12-02-2018': '16-02-2018'}
    , 'Easter18': {'30-03-2018': '13-04-2018'}
    , 'Half_Summer18': {'28-05-2018': '01-06-2018'}
    , 'Summer18': {'25-07-2018': '04-09-2018'}
    , 'Half_Fall18': {'22-10-2018': '25-10-2018'}
    , 'Xmas18': {'20-12-2018': '03-01-2018'}
}

data = [[holidays, start, end] for holidays, date_range in holidays_dic.items() for start, end in date_range.items()]
df = pd.DataFrame(data=data, columns=['holiday', 'sDate', 'eDate']).set_index(['holiday'])
print(df)

Output

                    sDate       eDate
holiday                              
Half_Summer18  28-05-2018  01-06-2018
Easter18       30-03-2018  13-04-2018
Xmas18         20-12-2018  03-01-2018
Xmas17         20-12-2017  02-01-2018
Half_Fall17    20-10-2017  26-10-2017
Half_Summer17  26-05-2017  01-06-2017
Summer18       25-07-2018  04-09-2018
Half_Fall18    22-10-2018  25-10-2018
Summer17       21-07-2017  31-08-2017
Half_Spring18  12-02-2018  16-02-2018
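
The desired output in the question uses two-digit years (26-05-17), while the frame above keeps the original four-digit strings. A minimal sketch of one way to get there, assuming the dates always follow the day-month-year pattern shown in the question; the shortened dictionary is only for illustration:

```python
import pandas as pd

# Shortened version of the question's dictionary, for illustration only.
holidays_dic = {
    'Half_Summer17': {'26-05-2017': '01-06-2017'},
    'Summer17': {'21-07-2017': '31-08-2017'},
}

# Same flattening as above: one [holiday, start, end] row per entry.
data = [[name, start, end]
        for name, date_range in holidays_dic.items()
        for start, end in date_range.items()]
df = pd.DataFrame(data, columns=['holiday', 'sDate', 'eDate']).set_index('holiday')

# Parse the strings as day-month-year, then format them with a two-digit year.
for col in ['sDate', 'eDate']:
    parsed = pd.to_datetime(df[col], format='%d-%m-%Y')
    df[col] = parsed.dt.strftime('%d-%m-%y')

print(df)
```

If the dates are going to be compared or used in arithmetic later, it is probably better to keep the parsed datetime columns and format them only for display.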

Answer 1 (score: 3)

Another approach.

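import pandas as pd

# holidays_dic is the nested dictionary from the question.
# dropna() discards the empty (holiday, start date) combinations; older pandas
# drops them inside stack(), while newer versions may keep them.
df = pd.DataFrame(holidays_dic).T.stack().dropna().reset_index(level=1)

df = df.rename(columns={'level_1': 'sDate', 0: 'eDate'})  # Rename columns.

print(df)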
                    sDate       eDate
Half_Summer17  26-05-2017  01-06-2017
Summer17       21-07-2017  31-08-2017
Half_Fall17    20-10-2017  26-10-2017
Xmas17         20-12-2017  02-01-2018
Half_Spring18  12-02-2018  16-02-2018
Easter18       30-03-2018  13-04-2018
Half_Summer18  28-05-2018  01-06-2018
Summer18       25-07-2018  04-09-2018
Half_Fall18    22-10-2018  25-10-2018
Xmas18         20-12-2018  03-01-2018
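
To see what each step of that chain contributes, here is a rough step-by-step sketch on a shortened dictionary. The intermediate names (wide, by_holiday, stacked) are made up for illustration. The explicit dropna() is an assumption on my part: whether stack() discards missing values by itself seems to depend on the pandas version, so the sketch does not rely on it.

```python
import pandas as pd

# Shortened version of the question's dictionary, for illustration only.
holidays_dic = {
    'Half_Summer17': {'26-05-2017': '01-06-2017'},
    'Summer17': {'21-07-2017': '31-08-2017'},
}

# Step 1: a dict of dicts becomes a frame with the outer keys (holidays) as
# columns and the inner keys (start dates) as the index. Most cells are NaN,
# which is why the question's original attempt did not look right.
wide = pd.DataFrame(holidays_dic)

# Step 2: transpose so that each holiday is a row.
by_holiday = wide.T

# Step 3: stack() folds the start-date columns into a second index level,
# giving a Series indexed by (holiday, start date). dropna() keeps only the
# pairs that actually hold an end date.
stacked = by_holiday.stack().dropna()

# Step 4: move the start-date level out of the index into a column. With
# unnamed index levels the new column is called 'level_1' and the values
# column is called 0, hence the rename.
df = stacked.reset_index(level=1)
df = df.rename(columns={'level_1': 'sDate', 0: 'eDate'})

print(wide)
print(df)
```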

Answer 2 (score: 1)

Use:

import pandas as pd

holidays_dic = {'Half_Summer17':['26-05-2017','01-06-2017'], 'Summer17':['21-07-2017','31-08-2017']}

df_holidays=pd.DataFrame.from_dict(holidays_dic, orient='index')
df_holidays.columns=['sDate', 'eDate']

OR

holidays_dic = {'sDate':['26-05-2017','21-07-2017'], 'eDate':['01-06-2017','31-08-2017'], 'index':['Half_Summer17', 'Summer17']}

df_holidays=pd.DataFrame.from_dict(holidays_dic)
df_holidays = df_holidays.set_index('index')

Output

                    sDate       eDate
Half_Summer17  26-05-2017  01-06-2017
Summer17       21-07-2017  31-08-2017

Timings

@Vivek [1st]

527 µs ± 140 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

@Vivek [2nd]

1.12 ms ± 169 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

@Sai Kumar

3.22 ms ± 416 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

@Daniel

1.21 ms ± 235 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
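
The figures above look like IPython %timeit output. For anyone who wants to rerun the comparison, here is a rough sketch using the standard timeit module. The function names are hypothetical labels for the three approaches in this thread, and I am not assuming which label above maps to which function. Absolute numbers will vary with machine and pandas version.

```python
import timeit

import pandas as pd

# Shortened version of the question's dictionary, for illustration only.
holidays_dic = {
    'Half_Summer17': {'26-05-2017': '01-06-2017'},
    'Summer17': {'21-07-2017': '31-08-2017'},
    'Half_Fall17': {'20-10-2017': '26-10-2017'},
    'Xmas17': {'20-12-2017': '02-01-2018'},
}


def list_comprehension():
    # Flatten the nested dict into rows, then build the frame.
    data = [[name, start, end]
            for name, date_range in holidays_dic.items()
            for start, end in date_range.items()]
    return pd.DataFrame(data, columns=['holiday', 'sDate', 'eDate']).set_index('holiday')


def transpose_stack():
    # Build the sparse frame, then reshape it with stack().
    df = pd.DataFrame(holidays_dic).T.stack().dropna().reset_index(level=1)
    return df.rename(columns={'level_1': 'sDate', 0: 'eDate'})


def from_dict_orient_index():
    # Restructure to {holiday: [start, end]} and use from_dict.
    as_lists = {name: [start, end]
                for name, date_range in holidays_dic.items()
                for start, end in date_range.items()}
    df = pd.DataFrame.from_dict(as_lists, orient='index')
    df.columns = ['sDate', 'eDate']
    return df


runs = 50  # small number of repetitions, enough for a rough comparison
timings = {}
for func in (list_comprehension, transpose_stack, from_dict_orient_index):
    seconds = timeit.timeit(func, number=runs) / runs
    timings[func.__name__] = seconds
    print(f"{func.__name__}: {seconds * 1e6:.0f} µs per call")
```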