Operations between pandas datetime-indexed DataFrames with unequal numbers of elements

Time: 2015-11-05 15:13:21

Tags: python pandas indexing dataframe operation

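I have two pandas DataFrames with different numbers of rows, both indexed by timestamp. I would like to divide one by the other, matching rows by index, without having to interpolate.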

DataFrame1 =

                                tbr45       tbl45       tbr90       tbl90
2013-09-09 11:35:00+00:00  481.205292  458.953156  572.320435  559.995605   
2013-09-09 11:36:00+00:00  484.707611  462.304871  573.970215  561.364807   
2013-09-09 11:37:00+00:00  488.629181  466.664246  578.624695  564.752808   
2013-09-09 11:38:00+00:00  490.437164  468.294403  580.286316  565.774475   
2013-09-09 11:39:00+00:00  492.522095  471.054016  582.710510  568.416321   
2013-09-09 11:40:00+00:00  494.583923  473.001190  584.202637  571.518433   
2013-09-09 11:41:00+00:00  498.174072  477.333557  586.001465  574.513794   

DataFrame2 =

                                tbr45       tbl45       tbr90       tbl90    
2013-09-09 11:41:00+00:00  498.174072  477.333557  586.001465  574.513794   
2013-09-09 11:42:00+00:00  499.323181  478.827942  587.080750  576.497192   
2013-09-09 11:43:00+00:00  502.315674  483.138062  589.863647  579.052368   
2013-09-09 11:44:00+00:00  503.036499  484.466675  592.452515  580.705750   
2013-09-09 11:45:00+00:00  505.769226  486.743713  595.071167  582.199707   
2013-09-09 11:46:00+00:00  507.393738  488.528107  597.469421  583.763977   
2013-09-09 11:47:00+00:00  509.901398  491.445221  598.312622  584.742004   
2013-09-09 11:48:00+00:00  511.310791  493.962524  600.510742  587.291992

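In this case, the operation DataFrame2 / DataFrame1 should produce a value only at 11:41:00, the one timestamp the two frames share, and that value is 1, 1, 1, 1. All other rows can be NaN.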

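In practice I have several days of data, and interpolating piece by piece is not a realistic option. Maybe apply could do it, but I don't know how.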

1 answer:

Answer 0: (score: 1)

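You can divide the DataFrames like this: result = df2.divide(df1, axis='index'). pandas aligns the rows on their index labels, so a timestamp that appears in only one of the frames gives NaN in the result.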

import pandas as pd
import numpy as np
import io

temp=u""";tbr45;tbl45;tbr90;tbl90
2013-09-09 11:35:00+00:00;481.205292;458.953156;572.320435;559.995605
2013-09-09 11:36:00+00:00;484.707611;462.304871;573.970215;561.364807
2013-09-09 11:37:00+00:00;488.629181;466.664246;578.624695;564.752808
2013-09-09 11:38:00+00:00;490.437164;468.294403;580.286316;565.774475
2013-09-09 11:39:00+00:00;492.522095;471.054016;582.710510;568.416321
2013-09-09 11:40:00+00:00;494.583923;473.001190;584.202637;571.518433
2013-09-09 11:41:00+00:00;498.174072;477.333557;586.001465;574.513794"""

df1 = pd.read_csv(io.StringIO(temp), sep=";", index_col=[0])
print(df1)

temp1=u""";tbr45;tbl45;tbr90;tbl90
2013-09-09 11:41:00+00:00;498.174072;477.333557;586.001465;574.513794
2013-09-09 11:42:00+00:00;499.323181;478.827942;587.080750;576.497192
2013-09-09 11:43:00+00:00;502.315674;483.138062;589.863647;579.052368
2013-09-09 11:44:00+00:00;503.036499;484.466675;592.452515;580.705750
2013-09-09 11:45:00+00:00;505.769226;486.743713;595.071167;582.199707
2013-09-09 11:46:00+00:00;507.393738;488.528107;597.469421;583.763977
2013-09-09 11:47:00+00:00;509.901398;491.445221;598.312622;584.742004
2013-09-09 11:48:00+00:00;511.310791;493.962524;600.510742;587.291992"""

df2 = pd.read_csv(io.StringIO(temp1), sep=";", index_col=[0])
print(df2)

result = df2.divide(df1, axis='index')
print(result)
#                           tbr45  tbl45  tbr90  tbl90
#2013-09-09 11:35:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:36:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:37:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:38:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:39:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:40:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:41:00+00:00      1      1      1      1
#2013-09-09 11:42:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:43:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:44:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:45:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:46:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:47:00+00:00    NaN    NaN    NaN    NaN
#2013-09-09 11:48:00+00:00    NaN    NaN    NaN    NaN
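
If only the overlapping timestamps are of interest, the all-NaN rows can be dropped after the division, or both frames can be restricted to their shared index first. The following is a minimal sketch of that idea, not part of the original answer. It uses small made-up frames with two of the columns (the values are illustrative, not the question's data) and assumes the plain `/` operator aligns two DataFrames the same way `divide` does.

```python
import pandas as pd

# Two small frames whose minute-level indexes overlap at one timestamp (11:41).
# The numbers are illustrative placeholders, not the data from the question.
idx1 = pd.date_range("2013-09-09 11:39", periods=3, freq="min", tz="UTC")
idx2 = pd.date_range("2013-09-09 11:41", periods=3, freq="min", tz="UTC")
df1 = pd.DataFrame({"tbr45": [492.5, 494.5, 498.0],
                    "tbl45": [471.0, 473.0, 477.0]}, index=idx1)
df2 = pd.DataFrame({"tbr45": [498.0, 499.0, 502.0],
                    "tbl45": [477.0, 478.5, 483.0]}, index=idx2)

# Division aligns on the union of both indexes; unmatched rows become NaN.
ratio = df2 / df1

# Option 1: drop the rows where every column is NaN.
overlap = ratio.dropna(how="all")

# Option 2: restrict both frames to the shared timestamps before dividing.
common = df2.index.intersection(df1.index)
overlap_alt = df2.loc[common] / df1.loc[common]

print(overlap)
```

The second option avoids building the full NaN-padded result, which may matter with several days of minute data. The first option would also drop rows that are NaN for other reasons, for example missing values in the source data.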