Question

I have run into this problem while writing code in MATLAB, and I hope someone knows how to solve it.

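Summary: I have a number of different financial portfolios (1070 portfolios), and I need to run a regression for each one. Then, using the residuals from that first regression, I want to bootstrap those residuals (drawing about 1000 bootstrap residual samples), but separately for each individual portfolio, because I cannot mix residuals from different portfolios.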

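Details: I have a vector that tells me the portfolio number, which is an arbitrary number but unique to that particular portfolio. The portfolio returns are collected in one long vector (14k observations), so what I need is some kind of "rolling window" OLS regression that regresses only the data belonging to a single portfolio, extracts the constant and the beta, saves them, and then does the same for all the other portfolios.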

I would end up with a matrix of constants and betas, in which each row corresponds to one particular portfolio.

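The portfolios have different numbers of data points, so one portfolio may have 60 observations while another has 150. It is therefore not possible to simply split the data into separate portfolios at fixed intervals.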

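For the bootstrapped residuals, as mentioned above, I need to draw from the residuals of the portfolio in question, not from the whole sample. I need these bootstrap samples for further data manipulation, but once I have the 1000 bootstrap samples, the rest is just ordinary addition and subtraction...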

Does anyone know how to do this? In Stata, for the regression part you can simply use the "by()" option, but for the bootstrap it is not that easy...

I would greatly appreciate any help!

Best regards, Philip

Answer 1

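I am attaching a couple of different residual bootstrap examples in MATLAB.

The following reference provides some good background information: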

http://www.economics.uci.edu/~dbrownst/bootmi.pdf

%% Generate a data set and perform your initial regression

clear all
clc

X = linspace(1, 1000, 1000)';
Y = 3*X + 5 + randn(1000,1);
X = [ones(length(X),1), X];

[b,bint,r,rint,stats] = regress(Y,X);

%% Parametric residual bootstrap

% This method assumes that your residuals are normally distributed and
% homoskedastic.  When those assumptions hold, a parametric residual
% bootstrap is more accurate and converges faster than a nonparametric
% residual bootstrap.

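% Use a one-sample kstest to check that the residuals are normally
% distributed.  kstest compares against the standard normal N(0,1), so
% standardize the residuals first.  A result of 0 means normality is not
% rejected at the 5% level.
kstest((r - mean(r)) / std(r))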

% Create a probability distribution object that models the residuals
foo = fitdist(r, 'normal')

% Create an array to store your new datasets
Nboot = 500;
My_Data = zeros(size(X,1), Nboot);   % one column per bootstrap data set

% Create your data sets

YHat = X*b;

for i = 1:Nboot

    boot_noise = random(foo, size(X,1), 1);   % one draw per observation
    My_Data(:,i) = YHat + boot_noise;

end

% Use your data to do something interesting

%%  Nonparametric residual bootstrap

% This method assumes that your errors are homoskedastic; however, there is
% no assumption that they are normally distributed.

% This method also uses the bootstrp function from Statistics Toolbox and
% assumes that we're using the bootstrp to estimate standard errors

standard_errors = std(bootstrp(1000,@(bootr)regress(YHat+bootr, X),r))
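
The examples above use a single data set. To apply the same idea portfolio by portfolio, as the question asks, one option is to loop over the unique portfolio identifiers, fit the OLS regression on that portfolio's rows only, and resample that portfolio's residuals with replacement. The following is only a minimal sketch under assumptions: the variable names (id, x, y, coefs, bootResid) and the simulated data are hypothetical and not from the original post, and a single regressor is assumed. It uses base MATLAB only (unique, the backslash operator and randi), so it does not need Statistics Toolbox.

```matlab
% Hypothetical inputs (not from the original post): id is the portfolio
% identifier, y the returns, x a single regressor, all N-by-1 and row-aligned.
rng(1);                                        % reproducible example
id = [repmat(101, 60, 1); repmat(205, 150, 1)];  % two portfolios, unequal sizes
x  = randn(numel(id), 1);
y  = 0.5 + 1.2*x + 0.3*randn(numel(id), 1);

Nboot = 1000;
[ids, ~, g] = unique(id);       % g maps every row to a group index 1..numel(ids)
nPort = numel(ids);
coefs = zeros(nPort, 2);        % one row per portfolio: [constant, beta]
bootResid = cell(nPort, 1);     % bootResid{k} is n_k-by-Nboot

for k = 1:nPort
    rows = (g == k);                        % rows belonging to portfolio k only
    Xk = [ones(nnz(rows), 1), x(rows)];
    bk = Xk \ y(rows);                      % OLS estimate by least squares
    rk = y(rows) - Xk*bk;                   % residuals of this portfolio
    coefs(k, :) = bk';

    nk  = numel(rk);
    idx = randi(nk, nk, Nboot);             % draw with replacement within portfolio k
    bootResid{k} = rk(idx);                 % each column is one bootstrap sample
end
```

A cell array is used for the resampled residuals because the portfolios have different numbers of observations, so the samples cannot be stacked into one rectangular matrix. Row k of coefs belongs to the portfolio whose identifier is ids(k).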
