Python Interpolate每月值到每日值(线性):熊猫

时间:2018-02-01 11:52:27

标签: python pandas timestamp time-series interpolation

我有一个pandas数据框,其中包含年月数据列(yyyymm)。我计划将插值数据发送到每日&每周价值。这是我的df。

df:
    201301      201302      201303      ...    201709       201710  
a   0.747711    0.793101    0.771819    ...    0.818161     0.812522 
b   0.776537    0.759745    0.733673    ...    0.757496     0.765181 
c   0.801699    0.847655    0.796586    ...    0.784537     0.763551 
d   0.797942    0.687899    0.729911    ...    0.819887     0.772395 
e   0.777472    0.799676    0.782947    ...    0.804533     0.791759 
f   0.780933    0.750774    0.781056    ...    0.790846     0.773705 
g   2.071699    2.261739    2.126915    ...    1.891780     2.098914 

正如你可以看到我的df是montly列数据,我希望将其更改为每日值。我打算使用线性函数。这是一个例子。

# (201302 - 201301)/31 (since January 2013 has 31 days)
a = (0.793101-0.747711)/31
# now a is the daily increasing (or decresing depends on values) value for a day.
# 2013-01-01 value woud be
0.747711
# 2013-01-02 value woud be
0.747711 + a
# 2013-01-03 value woud be
0.747711 + (a*2)
# last day of January would be
0.747711 + (a*30)
# first day of Feb would be
0.747711 + (a*31) which is 0.793101 (201302 value)

所以我的df_daily每天都会从2013年到2017年10月(第一天),价值就像上面那样。我一周都在使用时间戳,所以如果有任何方法可以插入我的值,从月到日值,那就太棒了。谢谢!

哦,如果我的问题令人困惑,请告诉我。

1 个答案:

答案 0 :(得分:3)

首先按to_datetime将列转换为datetimes,然后将reindex转换为NaN个缺失日期和最后interpolate

df.columns = pd.to_datetime(df.columns, format='%Y%m')
#by first and last values of columns
rng = pd.date_range(df.columns[0], df.columns[-1])
#alternatively min and max of columns
#rng = pd.date_range(df.columns.min(), df.columns.max())

df = df.reindex(rng, axis=1).interpolate(axis=1)

验证解决方案:

a = (0.793101-0.747711)/31
print (0.747711 + a)
print (0.747711 + a*2)
print (0.747711 + a*3)
0.7491751935483871
0.7506393870967742
0.7521035806451613
print (df)
  2013-01-01  2013-01-02  2013-01-03  2013-01-04  2013-01-05  2013-01-06  \
a    0.747711    0.749175    0.750639    0.752104    0.753568    0.755032   
b    0.776537    0.775995    0.775454    0.774912    0.774370    0.773829   
c    0.801699    0.803181    0.804664    0.806146    0.807629    0.809111   
d    0.797942    0.794392    0.790842    0.787293    0.783743    0.780193   
e    0.777472    0.778188    0.778905    0.779621    0.780337    0.781053   
f    0.780933    0.779960    0.778987    0.778014    0.777042    0.776069   
g    2.071699    2.077829    2.083960    2.090090    2.096220    2.102351   

   2013-01-07  2013-01-08  2013-01-09  2013-01-10     ...      2017-09-22  \
a    0.756496    0.757960    0.759425    0.760889     ...        0.814214   
b    0.773287    0.772745    0.772204    0.771662     ...        0.762876   
c    0.810594    0.812076    0.813559    0.815041     ...        0.769847   
d    0.776643    0.773094    0.769544    0.765994     ...        0.786643   
e    0.781770    0.782486    0.783202    0.783918     ...        0.795591   
f    0.775096    0.774123    0.773150    0.772177     ...        0.778847   
g    2.108481    2.114611    2.120742    2.126872     ...        2.036774   

   2017-09-23  2017-09-24  2017-09-25  2017-09-26  2017-09-27  2017-09-28  \
a    0.814026    0.813838    0.813650    0.813462    0.813274    0.813086   
b    0.763132    0.763388    0.763644    0.763900    0.764156    0.764413   
c    0.769147    0.768448    0.767748    0.767049    0.766349    0.765650   
d    0.785060    0.783476    0.781893    0.780310    0.778727    0.777144   
e    0.795165    0.794740    0.794314    0.793888    0.793462    0.793036   
f    0.778276    0.777705    0.777133    0.776562    0.775990    0.775419   
g    2.043678    2.050583    2.057487    2.064392    2.071296    2.078201   

   2017-09-29  2017-09-30  2017-10-01  
a    0.812898    0.812710    0.812522  
b    0.764669    0.764925    0.765181  
c    0.764950    0.764251    0.763551  
d    0.775561    0.773978    0.772395  
e    0.792611    0.792185    0.791759  
f    0.774848    0.774276    0.773705  
g    2.085105    2.092010    2.098914  

[7 rows x 1735 columns]