Pandas DataFrame: filling in missing months

Date: 2016-02-02 17:34:10

Tags: python datetime pandas time-series

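I have seen this done with pandas time series, but I would like some help doing it with DataFrames. I have a file of monthly values covering 1966-2009. There is no data for 1985, and I also want to add rows for 2010/2011. The added rows should simply contain NaN.

With the code below, I am trying to cut my dataset so that it starts in 1980, and then add the missing years filled with NaN values. However, nothing is being cut and nothing is being added. Is there something I am missing?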

idx = pd.date_range('01-01-1980','12-31-2011',freq='M')
years = np.arange(1980,2012,1)

obdata = pd.read_table('H:/Dissertation/Data/Observations/'+str(int(statid[e]))+'.txt',index_col='datetime',parse_dates={'datetime':[3,4]},date_parser=lambda x: pd.datetime.strptime(x, '%Y %m'),keep_date_col='True')
o_d = obdata.loc[obdata['Year'].isin(years)]
o_d['sd'] = np.nan_to_num(o_d['sd'])
o_d.reindex(idx,fill_value='NaN')

grouped_same.append(o_d['sd'])

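Some sample data is provided below. As you can see, 1985 is missing and the data stops in 2009. I am turning all existing NaNs into 0, and I want the missing monthly rows to be added with "sd" set to NaN.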

ID      Lat     Lon     Year Month  sd
32539   53.12   157.75  1978     1  127.00
32539   53.12   157.75  1978     2  150.00
32539   53.12   157.75  1978     3  152.00
32539   53.12   157.75  1978     4  139.00
32539   53.12   157.75  1978     5  63.00
32539   53.12   157.75  1978     6  NaN
32539   53.12   157.75  1978     7  NaN
32539   53.12   157.75  1978     8  NaN
32539   53.12   157.75  1978     9  NaN
32539   53.12   157.75  1978    10  2.00
32539   53.12   157.75  1978    11  17.50
32539   53.12   157.75  1978    12  79.00
32539   53.12   157.75  1979     1  72.00
32539   53.12   157.75  1979     2  113.00
32539   53.12   157.75  1979     3  129.00
32539   53.12   157.75  1979     4  109.67
32539   53.12   157.75  1979     5  51.00
32539   53.12   157.75  1979     6  NaN
32539   53.12   157.75  1979     7  NaN
32539   53.12   157.75  1979     8  NaN
32539   53.12   157.75  1979     9  NaN
32539   53.12   157.75  1979    10  22.50
32539   53.12   157.75  1979    11  68.67
32539   53.12   157.75  1979    12  90.00
32539   53.12   157.75  1980     1  183.00
32539   53.12   157.75  1980     2  217.00
32539   53.12   157.75  1980     3  218.00
32539   53.12   157.75  1980     4  201.50
32539   53.12   157.75  1980     5  133.67
32539   53.12   157.75  1980     6  32.00
32539   53.12   157.75  1980     7  NaN
32539   53.12   157.75  1980     8  NaN
32539   53.12   157.75  1980     9  NaN
32539   53.12   157.75  1980    10  20.50
32539   53.12   157.75  1980    11  56.67
32539   53.12   157.75  1980    12  78.33
32539   53.12   157.75  1981     1  108.33
32539   53.12   157.75  1981     2  125.33
32539   53.12   157.75  1981     3  124.00
32539   53.12   157.75  1981     4  108.67
32539   53.12   157.75  1981     5  42.00
32539   53.12   157.75  1981     6  NaN
32539   53.12   157.75  1981     7  NaN
32539   53.12   157.75  1981     8  NaN
32539   53.12   157.75  1981     9  NaN
32539   53.12   157.75  1981    10  16.00
32539   53.12   157.75  1981    11  38.67
32539   53.12   157.75  1981    12  66.33
32539   53.12   157.75  1982     1  94.33
32539   53.12   157.75  1982     2  131.33
32539   53.12   157.75  1982     3  127.33
32539   53.12   157.75  1982     4  101.33
32539   53.12   157.75  1982     5  29.33
32539   53.12   157.75  1982     6  NaN
32539   53.12   157.75  1982     7  NaN
32539   53.12   157.75  1982     8  NaN
32539   53.12   157.75  1982     9  NaN
32539   53.12   157.75  1982    10  12.50
32539   53.12   157.75  1982    11  28.33
32539   53.12   157.75  1982    12  66.67
32539   53.12   157.75  1983     1  108.33
32539   53.12   157.75  1983     2  121.33
32539   53.12   157.75  1983     3  133.67
32539   53.12   157.75  1983     4  128.33
32539   53.12   157.75  1983     5  66.00
32539   53.12   157.75  1983     6  NaN
32539   53.12   157.75  1983     7  NaN
32539   53.12   157.75  1983     8  NaN
32539   53.12   157.75  1983     9  NaN
32539   53.12   157.75  1983    10  11.00
32539   53.12   157.75  1983    11  35.33
32539   53.12   157.75  1983    12  72.00
32539   53.12   157.75  1984     1  87.33
32539   53.12   157.75  1984     2  163.00
32539   53.12   157.75  1984     3  185.00
32539   53.12   157.75  1984     4  154.33
32539   53.12   157.75  1984     5  79.00
32539   53.12   157.75  1984     6  NaN
32539   53.12   157.75  1984     7  NaN
32539   53.12   157.75  1984     8  NaN
32539   53.12   157.75  1984     9  NaN
32539   53.12   157.75  1984    10  NaN
32539   53.12   157.75  1984    11  44.67
32539   53.12   157.75  1984    12  76.33
32539   53.12   157.75  1986     1  148.33
32539   53.12   157.75  1986     2  160.00
32539   53.12   157.75  1986     3  178.00
32539   53.12   157.75  1986     4  131.00
32539   53.12   157.75  1986     5  61.33
32539   53.12   157.75  1986     6  NaN
32539   53.12   157.75  1986     7  NaN
32539   53.12   157.75  1986     8  NaN
32539   53.12   157.75  1986     9  NaN
32539   53.12   157.75  1986    10  NaN
32539   53.12   157.75  1986    11  NaN
32539   53.12   157.75  1986    12  NaN
32539   53.12   157.75  1987     1  73.00
32539   53.12   157.75  1987     2  102.67
32539   53.12   157.75  1987     3  142.33
32539   53.12   157.75  1987     4  128.33
32539   53.12   157.75  1987     5  75.50
32539   53.12   157.75  1987     6  NaN
32539   53.12   157.75  1987     7  NaN
32539   53.12   157.75  1987     8  NaN
32539   53.12   157.75  1987     9  NaN
32539   53.12   157.75  1987    10  19.00
32539   53.12   157.75  1987    11  56.67
32539   53.12   157.75  1987    12  84.50
32539   53.12   157.75  1988     1  98.33
32539   53.12   157.75  1988     2  120.33
32539   53.12   157.75  1988     3  144.67
32539   53.12   157.75  1988     4  134.33
32539   53.12   157.75  1988     5  70.33
32539   53.12   157.75  1988     6  NaN
32539   53.12   157.75  1988     7  NaN
32539   53.12   157.75  1988     8  NaN
32539   53.12   157.75  1988     9  NaN
32539   53.12   157.75  1988    10  NaN
32539   53.12   157.75  1988    11  58.67
32539   53.12   157.75  1988    12  109.67
32539   53.12   157.75  1989     1  113.00
32539   53.12   157.75  1989     2  156.00
32539   53.12   157.75  1989     3  181.00
32539   53.12   157.75  1989     4  168.00
32539   53.12   157.75  1989     5  NaN
32539   53.12   157.75  1989     6  NaN
32539   53.12   157.75  1989     7  NaN
32539   53.12   157.75  1989     8  NaN
32539   53.12   157.75  1989     9  NaN
32539   53.12   157.75  1989    10  8.00
32539   53.12   157.75  1989    11  46.00
32539   53.12   157.75  1989    12  92.67
32539   53.12   157.75  1990     1  102.67
32539   53.12   157.75  1990     2  131.67
32539   53.12   157.75  1990     3  153.50
32539   53.12   157.75  1990     4  132.00
32539   53.12   157.75  1990     5  53.25
32539   53.12   157.75  1990     6  NaN
32539   53.12   157.75  1990     7  NaN
32539   53.12   157.75  1990     8  NaN
32539   53.12   157.75  1990     9  NaN
32539   53.12   157.75  1990    10  NaN
32539   53.12   157.75  1990    11  28.00
32539   53.12   157.75  1990    12  56.00
32539   53.12   157.75  1991     1  83.33
32539   53.12   157.75  1991     2  118.00
32539   53.12   157.75  1991     3  118.00
32539   53.12   157.75  1991     4  127.67
32539   53.12   157.75  1991     5  69.50
32539   53.12   157.75  1991     6  NaN
32539   53.12   157.75  1991     7  NaN
32539   53.12   157.75  1991     8  NaN
32539   53.12   157.75  1991     9  NaN
32539   53.12   157.75  1991    10  18.00
32539   53.12   157.75  1991    11  27.00
32539   53.12   157.75  1991    12  56.00
32539   53.12   157.75  1992     1  62.00
32539   53.12   157.75  1992     2  107.00
32539   53.12   157.75  1992     3  133.67
32539   53.12   157.75  1992     4  122.67
32539   53.12   157.75  1992     5  74.25
32539   53.12   157.75  1992     6  NaN
32539   53.12   157.75  1992     7  NaN
32539   53.12   157.75  1992     8  NaN
32539   53.12   157.75  1992     9  NaN
32539   53.12   157.75  1992    10  3.00
32539   53.12   157.75  1992    11  33.33
32539   53.12   157.75  1992    12  64.33
32539   53.12   157.75  1993     1  80.67
32539   53.12   157.75  1993     2  96.00
32539   53.12   157.75  1993     3  101.67
32539   53.12   157.75  1993     4  120.00
32539   53.12   157.75  1993     5  78.33
32539   53.12   157.75  1993     6  NaN
32539   53.12   157.75  1993     7  NaN
32539   53.12   157.75  1993     8  NaN
32539   53.12   157.75  1993     9  NaN
32539   53.12   157.75  1993    10  10.00
32539   53.12   157.75  1993    11  36.67
32539   53.12   157.75  1993    12  81.00
32539   53.12   157.75  1994     1  125.00
32539   53.12   157.75  1994     2  209.00
32539   53.12   157.75  1994     3  199.00
32539   53.12   157.75  1994     4  205.00
32539   53.12   157.75  1994     5  132.00
32539   53.12   157.75  1994     6  49.00
32539   53.12   157.75  1994     7  NaN
32539   53.12   157.75  1994     8  NaN
32539   53.12   157.75  1994     9  NaN
32539   53.12   157.75  1994    10  3.00
32539   53.12   157.75  1994    11  34.33
32539   53.12   157.75  1994    12  61.00
32539   53.12   157.75  1995     1  76.33
32539   53.12   157.75  1995     2  96.00
32539   53.12   157.75  1995     3  125.33
32539   53.12   157.75  1995     4  142.00
32539   53.12   157.75  1995     5  43.75
32539   53.12   157.75  1995     6  NaN
32539   53.12   157.75  1995     7  NaN
32539   53.12   157.75  1995     8  NaN
32539   53.12   157.75  1995     9  NaN
32539   53.12   157.75  1995    10  NaN
32539   53.12   157.75  1995    11  34.67
32539   53.12   157.75  1995    12  55.67
32539   53.12   157.75  1996     1  142.50
32539   53.12   157.75  1996     2  162.00
32539   53.12   157.75  1996     3  152.00
32539   53.12   157.75  1996     4  191.00
32539   53.12   157.75  1996     5  85.33
32539   53.12   157.75  1996     6  NaN
32539   53.12   157.75  1996     7  NaN
32539   53.12   157.75  1996     8  NaN
32539   53.12   157.75  1996     9  NaN
32539   53.12   157.75  1996    10  4.00
32539   53.12   157.75  1996    11  41.00
32539   53.12   157.75  1996    12  99.00
32539   53.12   157.75  1997     1  185.00
32539   53.12   157.75  1997     2  232.00
32539   53.12   157.75  1997     3  239.00
32539   53.12   157.75  1997     4  218.00
32539   53.12   157.75  1997     5  141.50
32539   53.12   157.75  1997     6  43.00
32539   53.12   157.75  1997     7  NaN
32539   53.12   157.75  1997     8  NaN
32539   53.12   157.75  1997     9  NaN
32539   53.12   157.75  1997    10  NaN
32539   53.12   157.75  1997    11  29.00
32539   53.12   157.75  1997    12  80.33
32539   53.12   157.75  1998     1  121.33
32539   53.12   157.75  1998     2  130.33
32539   53.12   157.75  1998     3  127.00
32539   53.12   157.75  1998     4  123.67
32539   53.12   157.75  1998     5  85.67
32539   53.12   157.75  1998     6  NaN
32539   53.12   157.75  1998     7  NaN
32539   53.12   157.75  1998     8  NaN
32539   53.12   157.75  1998     9  NaN
32539   53.12   157.75  1998    10  26.00
32539   53.12   157.75  1998    11  45.67
32539   53.12   157.75  1998    12  78.33
32539   53.12   157.75  1999     1  132.50
32539   53.12   157.75  1999     2  142.00
32539   53.12   157.75  1999     3  168.00
32539   53.12   157.75  1999     4  150.50
32539   53.12   157.75  1999     5  77.33
32539   53.12   157.75  1999     6  NaN
32539   53.12   157.75  1999     7  NaN
32539   53.12   157.75  1999     8  NaN
32539   53.12   157.75  1999     9  NaN
32539   53.12   157.75  1999    10  7.00
32539   53.12   157.75  1999    11  43.33
32539   53.12   157.75  1999    12  83.67
32539   53.12   157.75  2000     1  90.00
32539   53.12   157.75  2000     2  94.00
32539   53.12   157.75  2000     3  98.00
32539   53.12   157.75  2000     4  91.00
32539   53.12   157.75  2000     5  48.00
32539   53.12   157.75  2000     6  NaN
32539   53.12   157.75  2000     7  NaN
32539   53.12   157.75  2000     8  NaN
32539   53.12   157.75  2000     9  NaN
32539   53.12   157.75  2000    10  34.00
32539   53.12   157.75  2000    11  56.67
32539   53.12   157.75  2000    12  67.67
32539   53.12   157.75  2002     1  130.67
32539   53.12   157.75  2002     2  121.00
32539   53.12   157.75  2002     3  136.00
32539   53.12   157.75  2002     4  151.50
32539   53.12   157.75  2002     5  51.00
32539   53.12   157.75  2002     6  NaN
32539   53.12   157.75  2002     7  NaN
32539   53.12   157.75  2002     8  NaN
32539   53.12   157.75  2002     9  NaN
32539   53.12   157.75  2002    10  12.50
32539   53.12   157.75  2002    11  23.33
32539   53.12   157.75  2002    12  48.33
32539   53.12   157.75  2003     1  71.00
32539   53.12   157.75  2003     2  91.67
32539   53.12   157.75  2003     3  105.00
32539   53.12   157.75  2003     4  100.67
32539   53.12   157.75  2003     5  61.00
32539   53.12   157.75  2003     6  NaN
32539   53.12   157.75  2003     7  NaN
32539   53.12   157.75  2003     8  NaN
32539   53.12   157.75  2003     9  NaN
32539   53.12   157.75  2003    10  NaN
32539   53.12   157.75  2003    11  34.33
32539   53.12   157.75  2003    12  76.33
32539   53.12   157.75  2004     1  109.00
32539   53.12   157.75  2004     2  128.33
32539   53.12   157.75  2004     3  138.33
32539   53.12   157.75  2004     4  127.50
32539   53.12   157.75  2004     5  66.67
32539   53.12   157.75  2004     6  NaN
32539   53.12   157.75  2004     7  NaN
32539   53.12   157.75  2004     8  NaN
32539   53.12   157.75  2004     9  NaN
32539   53.12   157.75  2004    10  6.50
32539   53.12   157.75  2004    11  52.00
32539   53.12   157.75  2004    12  105.67
32539   53.12   157.75  2005     1  156.00
32539   53.12   157.75  2005     2  205.00
32539   53.12   157.75  2005     3  273.00
32539   53.12   157.75  2005     4  216.00
32539   53.12   157.75  2005     5  117.00
32539   53.12   157.75  2005     6  41.00
32539   53.12   157.75  2005     7  NaN
32539   53.12   157.75  2005     8  NaN
32539   53.12   157.75  2005     9  NaN
32539   53.12   157.75  2005    10  10.00
32539   53.12   157.75  2005    11  38.00
32539   53.12   157.75  2005    12  108.00
32539   53.12   157.75  2006     1  191.50
32539   53.12   157.75  2006     2  199.00
32539   53.12   157.75  2006     3  195.00
32539   53.12   157.75  2006     4  209.00
32539   53.12   157.75  2006     5  109.50
32539   53.12   157.75  2006     6  44.50
32539   53.12   157.75  2006     7  NaN
32539   53.12   157.75  2006     8  NaN
32539   53.12   157.75  2006     9  NaN
32539   53.12   157.75  2006    10  9.00
32539   53.12   157.75  2006    11  8.33
32539   53.12   157.75  2006    12  27.33
32539   53.12   157.75  2007     1  54.33
32539   53.12   157.75  2007     2  67.67
32539   53.12   157.75  2007     3  145.67
32539   53.12   157.75  2007     4  124.00
32539   53.12   157.75  2007     5  55.00
32539   53.12   157.75  2007     6  NaN
32539   53.12   157.75  2007     7  NaN
32539   53.12   157.75  2007     8  NaN
32539   53.12   157.75  2007     9  16.00
32539   53.12   157.75  2007    10  1.50
32539   53.12   157.75  2007    11  36.00
32539   53.12   157.75  2007    12  74.00
32539   53.12   157.75  2008     1  119.67
32539   53.12   157.75  2008     2  125.50
32539   53.12   157.75  2008     3  153.00
32539   53.12   157.75  2008     4  124.00
32539   53.12   157.75  2008     5  43.25
32539   53.12   157.75  2008     6  NaN
32539   53.12   157.75  2008     7  NaN
32539   53.12   157.75  2008     8  NaN
32539   53.12   157.75  2008     9  NaN
32539   53.12   157.75  2008    10  16.00
32539   53.12   157.75  2008    11  56.00
32539   53.12   157.75  2008    12  103.00
32539   53.12   157.75  2009     1  NaN
32539   53.12   157.75  2009     2  181.00
32539   53.12   157.75  2009     3  190.00
32539   53.12   157.75  2009     4  175.00
32539   53.12   157.75  2009     5  81.00
32539   53.12   157.75  2009     6  NaN
32539   53.12   157.75  2009     7  NaN
32539   53.12   157.75  2009     8  NaN
32539   53.12   157.75  2009     9  NaN
32539   53.12   157.75  2009    10  14.00
32539   53.12   157.75  2009    11  43.67
32539   53.12   157.75  2009    12  79.00

1 answer:

Answer 0 (score: 1)

I think it works better to use a PeriodIndex here, which you get by converting the DatetimeIndex with to_period.
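A minimal sketch of why the period conversion helps, using three made-up months rather than the real file: dates parsed with '%Y %m' land on the first day of each month, while a month-end date range lands on the last day, so as timestamps the two indexes have no labels in common. The variable names and sample values below are illustrative assumptions.

```python
import pandas as pd

# Month-start stamps, as produced by parsing "Year Month" with '%Y %m'.
starts = pd.to_datetime(['1980-01-01', '1980-02-01', '1980-03-01'])
# Month-end stamps, as produced by a month-end date range.
ends = pd.to_datetime(['1980-01-31', '1980-02-29', '1980-03-31'])

s = pd.Series([183.0, 217.0, 218.0], index=starts)

# As timestamps the two indexes share no labels, so every value is lost.
by_timestamp = s.reindex(ends)

# As monthly periods both indexes collapse to the same labels.
s_period = s.copy()
s_period.index = starts.to_period('M')
by_period = s_period.reindex(ends.to_period('M'))

print(by_timestamp.isna().all())  # True
print(by_period.tolist())         # [183.0, 217.0, 218.0]
```

Once both sides are monthly periods, the day of the month no longer matters for alignment.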

You can also turn the existing NaN values in column sd into 0 with fillna:

import pandas as pd
import numpy as np

obdata = pd.read_csv('H:/Dissertation/Data/Observations/'+str(int(statid[e]))+'.txt',
                     index_col='datetime',
                     parse_dates={'datetime':[3,4]},
                     date_parser=lambda x: pd.datetime.strptime(x, '%Y %m'),
                     keep_date_col='True')
#print obdata.head(100)

# convert string column Year to int (maybe you can omit it)
obdata['Year'] = obdata['Year'].astype(int)

# build the target index as a PeriodIndex
# (the end date is 12-31 so that the month-end range includes December 2011)
idx = pd.date_range('01-01-1980','12-31-2011',freq='M').to_period('m')
years = np.arange(1980,2012,1)

# change the DatetimeIndex of the data to a PeriodIndex
obdata.index = obdata.index.to_period('m')

# fill NaN with 0 in column sd
obdata['sd'] = obdata['sd'].fillna(0)
o_d = obdata.loc[obdata['Year'].isin(years)]

# reindex (assigning the result) so that missing months are added as NaN
o_d = o_d.reindex(idx,fill_value=np.nan)

            ID    Lat     Lon  Year Month      sd
1980-01  32539  53.12  157.75  1980     1  183.00
1980-02  32539  53.12  157.75  1980     2  217.00
1980-03  32539  53.12  157.75  1980     3  218.00
1980-04  32539  53.12  157.75  1980     4  201.50
1980-05  32539  53.12  157.75  1980     5  133.67
1980-06  32539  53.12  157.75  1980     6   32.00
1980-07  32539  53.12  157.75  1980     7    0.00
1980-08  32539  53.12  157.75  1980     8    0.00
1980-09  32539  53.12  157.75  1980     9    0.00
1980-10  32539  53.12  157.75  1980    10   20.50
1980-11  32539  53.12  157.75  1980    11   56.67
1980-12  32539  53.12  157.75  1980    12   78.33
1981-01  32539  53.12  157.75  1981     1  108.33
1981-02  32539  53.12  157.75  1981     2  125.33
1981-03  32539  53.12  157.75  1981     3  124.00
1981-04  32539  53.12  157.75  1981     4  108.67
1981-05  32539  53.12  157.75  1981     5   42.00
1981-06  32539  53.12  157.75  1981     6    0.00
1981-07  32539  53.12  157.75  1981     7    0.00
1981-08  32539  53.12  157.75  1981     8    0.00
1981-09  32539  53.12  157.75  1981     9    0.00
1981-10  32539  53.12  157.75  1981    10   16.00
1981-11  32539  53.12  157.75  1981    11   38.67
1981-12  32539  53.12  157.75  1981    12   66.33
1982-01  32539  53.12  157.75  1982     1   94.33
1982-02  32539  53.12  157.75  1982     2  131.33
1982-03  32539  53.12  157.75  1982     3  127.33
1982-04  32539  53.12  157.75  1982     4  101.33
1982-05  32539  53.12  157.75  1982     5   29.33
1982-06  32539  53.12  157.75  1982     6    0.00
...        ...    ...     ...   ...   ...     ...
1984-03  32539  53.12  157.75  1984     3  185.00
1984-04  32539  53.12  157.75  1984     4  154.33
1984-05  32539  53.12  157.75  1984     5   79.00
1984-06  32539  53.12  157.75  1984     6    0.00
1984-07  32539  53.12  157.75  1984     7    0.00
1984-08  32539  53.12  157.75  1984     8    0.00
1984-09  32539  53.12  157.75  1984     9    0.00
1984-10  32539  53.12  157.75  1984    10    0.00
1984-11  32539  53.12  157.75  1984    11   44.67
1984-12  32539  53.12  157.75  1984    12   76.33
1985-01    NaN    NaN     NaN   NaN   NaN     NaN
1985-02    NaN    NaN     NaN   NaN   NaN     NaN
1985-03    NaN    NaN     NaN   NaN   NaN     NaN
1985-04    NaN    NaN     NaN   NaN   NaN     NaN
1985-05    NaN    NaN     NaN   NaN   NaN     NaN
1985-06    NaN    NaN     NaN   NaN   NaN     NaN
1985-07    NaN    NaN     NaN   NaN   NaN     NaN
1985-08    NaN    NaN     NaN   NaN   NaN     NaN
1985-09    NaN    NaN     NaN   NaN   NaN     NaN
1985-10    NaN    NaN     NaN   NaN   NaN     NaN
1985-11    NaN    NaN     NaN   NaN   NaN     NaN
1985-12    NaN    NaN     NaN   NaN   NaN     NaN
1986-01  32539  53.12  157.75  1986     1  148.33
1986-02  32539  53.12  157.75  1986     2  160.00
1986-03  32539  53.12  157.75  1986     3  178.00
1986-04  32539  53.12  157.75  1986     4  131.00
1986-05  32539  53.12  157.75  1986     5   61.33
1986-06  32539  53.12  157.75  1986     6    0.00
1986-07  32539  53.12  157.75  1986     7    0.00
1986-08  32539  53.12  157.75  1986     8    0.00

The table above is the output of:

print o_d.head(80)
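The same steps can be tried without the original file. The sketch below is not the answer's code: it uses a tiny made-up frame in which 1985 is absent and one existing reading is NaN, and it builds the monthly index from the Year and Month columns instead of a date parser. The frame contents and the names df, full and out are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Tiny synthetic frame: 1985 is absent entirely, one existing row has NaN.
df = pd.DataFrame({
    'Year':  [1984, 1984, 1986, 1986],
    'Month': [11, 12, 1, 2],
    'sd':    [44.67, np.nan, 148.33, 160.0],
})

# Build a monthly PeriodIndex from the Year/Month columns.
stamps = pd.to_datetime(
    pd.DataFrame({'year': df['Year'], 'month': df['Month'], 'day': 1})
)
df.index = pd.DatetimeIndex(stamps).to_period('M')

# Existing NaN readings become 0 before any new rows are introduced.
df['sd'] = df['sd'].fillna(0)

# reindex returns a new object, so the result has to be assigned.
full = pd.period_range('1984-11', '1986-02', freq='M')
out = df.reindex(full)

print(len(out))                                    # 16
print(out.loc[pd.Period('1984-12', 'M'), 'sd'])    # 0.0 (was NaN in the input)
print(out.loc[pd.Period('1985-06', 'M'), 'sd'])    # nan (row added by reindex)
```

Doing fillna before reindex is what keeps the two kinds of gap distinct: readings that were NaN in the file end up as 0, while months that were never in the file stay NaN.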