numpy.polyfit has no keyword 'cov'

Time: 2012-04-21 18:55:04

Tags: python numpy covariance

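I'm trying to use polyfit to find the best-fit straight line to a set of data, but I also need to know the uncertainty on the parameters, so I want the covariance matrix too. The online documentation suggests I write:

polyfit(x, y, 2, cov=True)

But this gives the error:

TypeError: polyfit() got an unexpected keyword argument 'cov'

And sure enough, help(polyfit) shows no keyword argument 'cov'.

Does the online documentation refer to an earlier release of numpy? (I have 1.6.1, the newest one.) I could write my own polyfit function, but does anyone have any suggestions for why my polyfit has no covariance option?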

Thanks

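1 Answer:

Answer 0: (score: 1)

For a solution that comes from a library, I find scikits.statsmodels to be a convenient choice. In statsmodels, regression objects have callable attributes that return the parameters and standard errors. Below is an example of how this would work for you: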

# Imports; NumPy is assumed for forming your data.
import numpy as np
# Newer releases ship this package as `statsmodels` (import statsmodels.api as sm).
import scikits.statsmodels.api as sm

# Form the data here. X is assumed to be a column vector of shape (N, 1),
# so that np.hstack below appends one column per polynomial term.
(X, Y) = ....

deg = 2  # Polynomial degree, as in polyfit(x, y, 2) from the question.

reg_x_data = np.ones(X.shape)                    # 0th degree term.
for ii in range(1, deg + 1):
    reg_x_data = np.hstack((reg_x_data, X**ii))  # Append the ii^th degree term.


# Store OLS regression results into `result`
result = sm.OLS(Y, reg_x_data).fit()


# Print the estimated coefficients
print(result.params)

# Print the basic OLS standard error in the coefficients
print(result.bse)

# Print the estimated basic OLS covariance matrix
print(result.cov_params())   # <-- Notice, this one is a function by convention.

# Print the heteroskedasticity-consistent standard error
print(result.HC0_se)

# Print the heteroskedasticity-consistent covariance matrix
# (reference HC0_se first; that is what computes and stores cov_HC0)
print(result.cov_HC0)

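There are other robust covariance attributes in the result object as well. You can see them by printing dir(result). Also, by convention, the covariance matrix of a heteroskedasticity-consistent estimator is only available after you have referenced the corresponding standard error. For example, you must call result.HC0_se before result.cov_HC0, because the first reference causes the second to be computed and stored.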

Pandas is another library that probably provides more advanced support for these operations.
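As a side note on the library route: NumPy releases newer than the 1.6.1 mentioned in the question do accept `cov=True` in `polyfit`, which is what the online documentation describes. The snippet below is only a sketch on synthetic data made up for illustration. How the returned covariance is scaled has differed between NumPy versions, so check the documentation of the release you have installed.

```python
import numpy as np

# Synthetic data, invented for illustration only.
rng = np.random.RandomState(0)
x = np.linspace(-3.0, 3.0, 50)
y = 1.0 + 2.0 * x + 3.0 * x**2 + 0.1 * rng.randn(x.size)

# Needs a NumPy release newer than 1.6.1, where `cov` is a valid keyword.
coeffs, cov = np.polyfit(x, y, 2, cov=True)

# Standard errors are the square roots of the diagonal of the covariance.
std_err = np.sqrt(np.diag(cov))

print(coeffs)   # coefficients, highest degree first
print(std_err)  # one standard error per coefficient
```

On an old installation this call raises the same TypeError as in the question, in which case the approaches in this answer still apply.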

Non-library function

This may be useful when you don't want to, or can't, rely on an extra library function.

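Below is a function I wrote that returns the OLS regression coefficients along with a bunch of other things. It returns the residuals, the regression variance and the standard error of the regression (the square root of that variance), the asymptotic formula for the large-sample variance, the OLS covariance matrix, the heteroskedasticity-consistent "robust" covariance estimate (which is the OLS covariance but weighted according to the residuals), and the "White" or "bias-corrected" heteroskedasticity-consistent covariance.

For your problem, call it with a design matrix X whose columns are the powers of x and a column vector Y of observations. The function is: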

import numpy as np

###
# Regression and standard error estimation functions
###
def ols_linreg(X, Y):
    """ ols_linreg(X,Y) 

        Ordinary least squares regression estimator given explanatory variables 
        matrix X and observations matrix Y. The length of the first dimension of 
        X and Y must be the same (equal to the number of samples in the data set).

        Note: these methods should be adapted if you need to use this for large data.
        This is mostly for illustrating what to do for calculating the different
        classical standard errors. You would never really want to compute the inverse
        matrices for large problems.

        This was developed with NumPy 1.5.1.
    """
    (N, K) = X.shape
    t1 = np.linalg.inv( (np.transpose(X)).dot(X) )
    t2 = (np.transpose(X)).dot(Y)

    beta = t1.dot(t2)
    residuals = Y - X.dot(beta)
    sig_hat = (1.0/(N-K))*np.sum(residuals**2)
    sig_hat_asymptotic_variance = 2*sig_hat**2/N
    conv_st_err = np.sqrt(sig_hat)

    sum1 = 0.0
    for ii in range(N):
        sum1 = sum1 + np.outer(X[ii,:],X[ii,:])

    sum1 = (1.0/N)*sum1
    ols_cov = (sig_hat/N)*np.linalg.inv(sum1)

    PX = X.dot(  np.linalg.inv(np.transpose(X).dot(X)).dot(np.transpose(X))   )
    robust_se_mat1 = np.linalg.inv(np.transpose(X).dot(X))
    robust_se_mat2 = np.transpose(X).dot(np.diag(residuals[:,0]**(2.0)).dot(X))
    robust_se_mat3 = np.transpose(X).dot(np.diag(residuals[:,0]**(2.0)/(1.0-np.diag(PX))).dot(X))

    v_robust = robust_se_mat1.dot(robust_se_mat2.dot(robust_se_mat1))
    v_modified_robust = robust_se_mat1.dot(robust_se_mat3.dot(robust_se_mat1))

    """ Returns:
        beta -- The vector of coefficient estimates, ordered as the columns of X.
        residuals -- The vector of residuals, Y - X.beta
        sig_hat -- The sample variance of the residuals.
        conv_st_err -- The 'standard error of the regression', sqrt(sig_hat).
        sig_hat_asymptotic_variance -- The analytic formula for the large sample variance
        ols_cov -- The covariance matrix under the basic OLS assumptions.
        v_robust -- The "robust" covariance matrix, weighted to account for the residuals and heteroskedasticity.
        v_modified_robust -- The bias-corrected and heteroskedasticity-consistent covariance matrix.
    """
    return beta, residuals, sig_hat, conv_st_err, sig_hat_asymptotic_variance, ols_cov, v_robust, v_modified_robust
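A minimal sketch of how the inputs for the question's quadratic fit could be prepared, assuming NumPy only. The helper name `poly_design_matrix` and the synthetic data are hypothetical, made up for illustration. To keep the snippet self-contained it recomputes `beta`, `sig_hat` and `ols_cov` inline with the same formulas as `ols_linreg` rather than calling it.

```python
import numpy as np

def poly_design_matrix(x, deg):
    """Hypothetical helper: columns are x**0, x**1, ..., x**deg."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x**k for k in range(deg + 1)])

# Synthetic data, invented for illustration only.
rng = np.random.RandomState(0)
x = np.linspace(-3.0, 3.0, 50)
y = 1.0 + 2.0 * x + 3.0 * x**2 + 0.1 * rng.randn(x.size)

X = poly_design_matrix(x, 2)   # shape (N, K) = (50, 3)
Y = y.reshape(-1, 1)           # column vector, shape (N, 1)

# Same quantities as beta, sig_hat and ols_cov in ols_linreg.
N, K = X.shape
XtX_inv = np.linalg.inv(X.T.dot(X))
beta = XtX_inv.dot(X.T.dot(Y))
residuals = Y - X.dot(beta)
sig_hat = np.sum(residuals**2) / (N - K)
ols_cov = sig_hat * XtX_inv

# Parameter uncertainties: square roots of the covariance diagonal.
std_err = np.sqrt(np.diag(ols_cov))

print(beta.ravel())  # coefficients, lowest degree first
print(std_err)       # one standard error per coefficient
```

With `X` and `Y` shaped like this, `ols_linreg(X, Y)` can be called directly. Note that the coefficients come out lowest degree first, the reverse of the order `polyfit` uses.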

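If you find any errors in my calculations (especially in the heteroskedasticity-consistent estimators), please let me know and I will fix them as soon as possible.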
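To make that checking easier, here is my reading of what the code in `ols_linreg` computes, written out as formulas. The symbols are my own notation, not something from the code: \hat e is `residuals`, \hat\sigma^2 is `sig_hat`, and h_{ii} is the i-th diagonal entry of `PX`.

```latex
\begin{aligned}
\hat\beta &= (X^\top X)^{-1} X^\top Y, \qquad \hat e = Y - X\hat\beta \\
\hat\sigma^2 &= \frac{1}{N-K}\sum_{i=1}^{N} \hat e_i^{\,2}, \qquad
\widehat{\operatorname{Var}}(\hat\sigma^2) \approx \frac{2\hat\sigma^4}{N} \\
V_{\text{ols}} &= \hat\sigma^2\,(X^\top X)^{-1} \\
V_{\text{robust}} &= (X^\top X)^{-1} X^\top \operatorname{diag}\!\left(\hat e_i^{\,2}\right) X\,(X^\top X)^{-1} \\
V_{\text{modified}} &= (X^\top X)^{-1} X^\top \operatorname{diag}\!\left(\frac{\hat e_i^{\,2}}{1-h_{ii}}\right) X\,(X^\top X)^{-1},
\qquad h_{ii} = \left[X (X^\top X)^{-1} X^\top\right]_{ii}
\end{aligned}
```

In the code, `ols_cov` is written as (sig_hat/N) times the inverse of (1/N) times the sum of outer products of the rows of X, which reduces to the V_ols line above. As far as I know, the V_robust form is usually labelled HC0 and the V_modified form, with its division by 1 - h_ii, is usually labelled HC2.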