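I have two pandas DataFrames with different numbers of rows, but both are indexed. I want to divide one by the other, matching rows by index, without having to interpolate.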
DataFrame1 =
tbr45 tbl45 tbr90 tbl90
2013-09-09 11:35:00+00:00 481.205292 458.953156 572.320435 559.995605
2013-09-09 11:36:00+00:00 484.707611 462.304871 573.970215 561.364807
2013-09-09 11:37:00+00:00 488.629181 466.664246 578.624695 564.752808
2013-09-09 11:38:00+00:00 490.437164 468.294403 580.286316 565.774475
2013-09-09 11:39:00+00:00 492.522095 471.054016 582.710510 568.416321
2013-09-09 11:40:00+00:00 494.583923 473.001190 584.202637 571.518433
2013-09-09 11:41:00+00:00 498.174072 477.333557 586.001465 574.513794
DataFrame2 =
tbr45 tbl45 tbr90 tbl90
2013-09-09 11:41:00+00:00 498.174072 477.333557 586.001465 574.513794
2013-09-09 11:42:00+00:00 499.323181 478.827942 587.080750 576.497192
2013-09-09 11:43:00+00:00 502.315674 483.138062 589.863647 579.052368
2013-09-09 11:44:00+00:00 503.036499 484.466675 592.452515 580.705750
2013-09-09 11:45:00+00:00 505.769226 486.743713 595.071167 582.199707
2013-09-09 11:46:00+00:00 507.393738 488.528107 597.469421 583.763977
2013-09-09 11:47:00+00:00 509.901398 491.445221 598.312622 584.742004
2013-09-09 11:48:00+00:00 511.310791 493.962524 600.510742 587.291992
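In this case, DataFrame2 / DataFrame1 obviously has only one matching element, at 11:41:00, where the result is 1, 1, 1, 1. The other results could be NaN.
In practice I have several days of data, and interpolating them one by one is a poor option. Maybe apply could be used, but I don't know how.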
Answer 0 (score: 1)
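You can divide the DataFrames with result = df2.divide(df1, axis='index'). pandas aligns the two frames on their index labels, and any label present in only one of them yields NaN: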
import pandas as pd
import numpy as np
import io
temp=u""";tbr45;tbl45;tbr90;tbl90
2013-09-09 11:35:00+00:00;481.205292;458.953156;572.320435;559.995605
2013-09-09 11:36:00+00:00;484.707611;462.304871;573.970215;561.364807
2013-09-09 11:37:00+00:00;488.629181;466.664246;578.624695;564.752808
2013-09-09 11:38:00+00:00;490.437164;468.294403;580.286316;565.774475
2013-09-09 11:39:00+00:00;492.522095;471.054016;582.710510;568.416321
2013-09-09 11:40:00+00:00;494.583923;473.001190;584.202637;571.518433
2013-09-09 11:41:00+00:00;498.174072;477.333557;586.001465;574.513794"""
# Use the first (unnamed) column, the timestamp, as the index
df1 = pd.read_csv(io.StringIO(temp), sep=";", index_col=[0])
print(df1)
temp1=u""";tbr45;tbl45;tbr90;tbl90
2013-09-09 11:41:00+00:00;498.174072;477.333557;586.001465;574.513794
2013-09-09 11:42:00+00:00;499.323181;478.827942;587.080750;576.497192
2013-09-09 11:43:00+00:00;502.315674;483.138062;589.863647;579.052368
2013-09-09 11:44:00+00:00;503.036499;484.466675;592.452515;580.705750
2013-09-09 11:45:00+00:00;505.769226;486.743713;595.071167;582.199707
2013-09-09 11:46:00+00:00;507.393738;488.528107;597.469421;583.763977
2013-09-09 11:47:00+00:00;509.901398;491.445221;598.312622;584.742004
2013-09-09 11:48:00+00:00;511.310791;493.962524;600.510742;587.291992"""
df2 = pd.read_csv(io.StringIO(temp1), sep=";", index_col=[0])
print(df2)
# Division is aligned on the index; rows missing from either frame become NaN
result = df2.divide(df1, axis='index')
print(result)
# tbr45 tbl45 tbr90 tbl90
#2013-09-09 11:35:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:36:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:37:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:38:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:39:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:40:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:41:00+00:00 1 1 1 1
#2013-09-09 11:42:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:43:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:44:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:45:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:46:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:47:00+00:00 NaN NaN NaN NaN
#2013-09-09 11:48:00+00:00 NaN NaN NaN NaN
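
If only the overlapping timestamps are of interest, the all-NaN rows can be dropped afterwards, or both frames can be restricted to their shared index before dividing. The following is a minimal sketch rather than part of the original answer. It assumes a reduced two-column version of the question's data, and the names ratio_union, common and ratio_common are made up for illustration.

```python
import pandas as pd

# Reduced copies of the question's frames: they share only the 11:41 timestamp.
idx1 = pd.to_datetime(["2013-09-09 11:40:00+00:00", "2013-09-09 11:41:00+00:00"])
idx2 = pd.to_datetime(["2013-09-09 11:41:00+00:00", "2013-09-09 11:42:00+00:00"])
df1 = pd.DataFrame(
    {"tbr45": [494.583923, 498.174072], "tbl45": [473.001190, 477.333557]},
    index=idx1,
)
df2 = pd.DataFrame(
    {"tbr45": [498.174072, 499.323181], "tbl45": [477.333557, 478.827942]},
    index=idx2,
)

# Option 1: divide on the union of both indexes, then drop rows that are all NaN.
ratio_union = df2.div(df1).dropna(how="all")

# Option 2: keep only the timestamps present in both frames, then divide.
common = df2.index.intersection(df1.index)
ratio_common = df2.loc[common] / df1.loc[common]

print(ratio_common)
```

Both options should leave a single row for 11:41. Option 2 avoids building the full union of timestamps first, which may matter when each frame covers several days.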