statsmodel

Time: 2017-08-01 18:40:47

Tags: python pandas error-handling regression statsmodels

I have a pandas DataFrame that looks like this:

   broker-value-current  broker-value-prior      consensus-after  
                 590.00              510.00              462.55   
                  32.74               31.98               30.72   
                  33.00               30.00               30.04 

           pctch_broker      pctch_consensus    pctch_frstrec_eps 
              15.686275             1.599051             1.421657   
               2.376485             0.195695           -82.098455   
              10.000000             0.805369           -82.098455  

      pctch_frstrec_rev  
               1.243782  
              -1.258936  
              -1.258936 

The last few columns were created with:

 data['pctch_broker'] = ((data['broker-value-current']-data['broker-value-prior'])/data['broker-value-prior'])*100
 data['pctch_consensus'] = ((data['consensus-after']-data['consensus-before'])/data['consensus-before'])*100
 data['pctch_frstrec_eps'] = ((data['frstrec_eps_announced']-data['frstrec_eps_forecast'])/data['frstrec_eps_forecast'])*100
 data['pctch_frstrec_rev'] = ((data['frstrec_rev_announced']-data['frstrec_rev_forecast'])/data['frstrec_rev_forecast'])*100

I also dropped the NA values with this line:

cleaned_data = data.dropna()

For the regression I use the statsmodels formula API, imported like this:

 import statsmodels.formula.api as sm

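However, when I try to regress 'pctch_consensus' or 'pctch_broker' (the dependent variable, on the left of the `~`) on 'pctch_frstrec_rev' or 'pctch_frstrec_eps' (the independent variable) with this code: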

 reg1 = sm.ols(formula="pctch_consensus ~ pctch_frstrec_rev", data=cleaned_data).fit()

I get this error:

RuntimeWarning: invalid value encountered in greater
  return (S > tol).sum(axis=-1)

1 Answer:

Answer 0 (score: 0)

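This happens because your DataFrame contains infinite values. You most likely created them by dividing by zero when you computed the new percentage-change columns. Note that dropna() only removes NaN; it leaves inf and -inf in place.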
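To see the effect in isolation, here is a minimal sketch on made-up data. The frame `toy` and its column names `current`, `prior`, and `pctch` are hypothetical and are not taken from the question; the percentage-change formula mirrors the one used there. It suggests that a zero in the denominator yields inf and that dropna() does not remove it.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the second row has a zero in the denominator column.
toy = pd.DataFrame({"current": [590.0, 32.74, 33.0],
                    "prior": [510.0, 0.0, 30.0]})

# Same shape of formula as in the question; float division by zero gives inf.
toy["pctch"] = (toy["current"] - toy["prior"]) / toy["prior"] * 100

# dropna() only removes NaN, so the inf row is still there afterwards.
after_dropna = toy.dropna()
n_inf = int(np.isinf(after_dropna["pctch"]).sum())
print(len(after_dropna), n_inf)
```

Running the same `np.isinf(...).sum()` check on your own columns should tell you whether this is what is happening in your data.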

This should fix it:

import numpy as np
cleaned_data = data.replace([np.inf, -np.inf], np.nan).dropna()  # turn +/-inf into NaN, then drop those rows
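If you only want to drop rows that are bad in the columns used by the regression, something along these lines may be more convenient. This is a sketch only: the helper name `drop_non_finite` and the toy values are made up for illustration, and only the two column names are borrowed from the question.

```python
import numpy as np
import pandas as pd

def drop_non_finite(df, columns):
    """Return a copy of df without rows where any of `columns` is NaN or +/-inf."""
    cleaned = df.replace([np.inf, -np.inf], np.nan)
    return cleaned.dropna(subset=columns)

# Hypothetical toy frame with one inf and one NaN in the regression columns.
toy = pd.DataFrame({
    "pctch_consensus": [1.599051, np.inf, 0.805369, 0.195695],
    "pctch_frstrec_rev": [1.243782, -1.258936, np.nan, -1.258936],
})

cleaned_toy = drop_non_finite(toy, ["pctch_consensus", "pctch_frstrec_rev"])

# Sanity check: every remaining value is finite.
all_finite = bool(np.isfinite(cleaned_toy.to_numpy()).all())
print(len(cleaned_toy), all_finite)
```

The resulting frame can then be passed as `data=` to the same `sm.ols(...)` call as in the question.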