Question

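I am running a SUR regression with the returns of 80 different banks as the dependent variables. The independent variables are always the same. If it is needed to answer my question, you should be able to recreate the regression using the code below:

How can I test whether the mean of the "Event" coefficients across all 80 banks in the sample is equal to zero?

How can I test whether the mean "Event" coefficients of subgroups of the sample (for example, the first 20 banks and the last 40 banks) are similar, or whether they differ significantly?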

from google.cloud import storage
from zipfile import ZipFile
from zipfile import is_zipfile
import io

def zipextract(bucketname, zipfilename_with_path):

    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucketname)

    destination_blob_pathname = zipfilename_with_path

    blob = bucket.blob(destination_blob_pathname)
    zipbytes = io.BytesIO(blob.download_as_string())

    if is_zipfile(zipbytes):
        with ZipFile(zipbytes, 'r') as myzip:
            for contentfilename in myzip.namelist():
                contentfile = myzip.read(contentfilename)
                blob = bucket.blob(zipfilename_with_path + "/" + contentfilename)
                blob.upload_from_string(contentfile)

zipextract("mybucket", "path/file.zip") # if the file is gs://mybucket/path/file.zip

Thanks for your help. Please let me know if anything is missing!

Answer 1

For this we can use linearHypothesis directly (see ?linearHypothesis.systemfit). In the first case we have

coefs <- coef(cyp3sur)
R1 <- matrix(0, nrow = 1, ncol = length(coefs))
R1[1, grep("Intercept", names(coefs))] <- 1
linearHypothesis(cyp3sur, R1)

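Here R1 has a single row because there is a single restriction. Note that I give the coefficients a weight of 1 rather than 1/80, since the two are equivalent (X + Y = 0 is the same as (X + Y) / 2 = 0). Using grep lets me find the positions of the intercepts. To target the "Event" coefficients that the question asks about, the same grep call with the pattern "Event" should select them instead, assuming no other coefficient name contains that string.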
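The claim that the sum restriction and the mean restriction are equivalent can be checked numerically. The following is a minimal sketch in plain Python, not the systemfit implementation: it computes the Wald statistic for a single restriction row r, (r'b)^2 / (r'Vr), and shows that rescaling r leaves it unchanged. The coefficient vector b, the covariance matrix V and the helper name wald_single are hypothetical and chosen only for illustration.

```python
# Plain-Python sketch (not systemfit): Wald statistic for one linear
# restriction r'b = 0, evaluated for a "sum" row and a "mean" row.

def wald_single(r, b, V):
    """Return (r'b)^2 / (r' V r) for a single restriction row r."""
    n = len(r)
    rb = sum(r[i] * b[i] for i in range(n))                        # r'b
    Vr = [sum(V[i][j] * r[j] for j in range(n)) for i in range(n)]  # V r
    rVr = sum(r[i] * Vr[i] for i in range(n))                       # r'Vr
    return rb * rb / rVr

# Hypothetical estimates for three coefficients and their covariance matrix.
b = [0.4, -0.1, 0.3]
V = [[0.04, 0.01, 0.00],
     [0.01, 0.09, 0.02],
     [0.00, 0.02, 0.16]]

r_sum = [1.0, 1.0, 1.0]            # tests b1 + b2 + b3 = 0
r_mean = [1.0 / 3.0] * 3           # tests (b1 + b2 + b3) / 3 = 0

w_sum = wald_single(r_sum, b, V)
w_mean = wald_single(r_mean, b, V)

# The two statistics agree up to floating-point rounding.
print(w_sum, w_mean)
```

Scaling r by a constant c multiplies both the numerator and the denominator by c squared, which is why the weight 1 can be used in place of 1/80.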

Similarly, in the second case we have

R2 <- matrix(0, nrow = 1, ncol = length(coefs))
gr1 <- paste0("X", 1:20, "_Event")
gr2 <- paste0("X", 41:80, "_Event")
R2[1, names(coefs) %in% gr1] <- 1 / 20
R2[1, names(coefs) %in% gr2] <- -1 / 40
linearHypothesis(cyp3sur, R2)

Here I construct the names of the variables of interest with paste0 and use %in% to determine their positions in coefs.
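Written out, the single row of R2 appears to encode the following null hypothesis. The notation is introduced here for illustration only: \beta_i^{Event} stands for the coefficient named X<i>_Event in coefs, and \beta is the full stacked coefficient vector.

```latex
\[
H_0:\quad R_2\,\beta
  \;=\; \frac{1}{20}\sum_{i=1}^{20}\beta_i^{\mathrm{Event}}
  \;-\; \frac{1}{40}\sum_{i=41}^{80}\beta_i^{\mathrm{Event}}
  \;=\; 0
\]
```

Unlike the first case, the weights 1/20 and -1/40 matter here: the two groups have different sizes, so using 1 and -1 would compare group sums rather than group means.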

SUR regression: testing whether the mean of coefficients is equal to zero
