How to calculate weights to minimize variance?

Asked: 2017-07-08 08:24:49

Tags: python matlab optimization variance convex-optimization

Given a few vectors:

x1 = [3 4 6]
x2 = [2 8 1]
x3 = [5 5 4]
x4 = [6 2 1]

I want to find weights w1, w2, w3 for the components and take the weighted sum of each vector: yi = w1*xi1 + w2*xi2 + w3*xi3. For example, y1 = 3*w1 + 4*w2 + 6*w3. The weights should make the variance of the values (y1, y2, y3, y4) as small as possible.

Note: w1, w2, w3 should be > 0 and w1 + w2 + w3 = 1.

I don't know what kind of problem this is, or how to solve it in Python or MATLAB.

4 answers:

Answer 0 (score: 1)

You can start by writing down a loss function consisting of the variance of the y values plus the constraint. The mean is m = (1/4)*(y1 + y2 + y3 + y4), so the variance is (1/4)*((y1-m)^2 + (y2-m)^2 + (y3-m)^2 + (y4-m)^2), and the constraint term is a*(w1+w2+w3 - 1), where a is a Lagrange multiplier. This looks like a convex optimization problem with convex constraints, since the loss function is quadratic in the target variables (w1, w2, w3) and the constraint is linear. You can look up projected gradient descent algorithms for this kind of constraint; see http://www.ifp.illinois.edu/~angelia/L5_exist_optimality.pdf. In general there is no direct analytic solution for this class of problems.
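Following that suggestion, here is a minimal projected gradient descent sketch in NumPy on the question's data. The step size, iteration count, and the sort-based simplex projection are my own choices, not taken from the answer:

```python
import numpy as np

# Rows are the question's vectors x1..x4; the weights act on the columns.
X = np.array([[3, 4, 6],
              [2, 8, 1],
              [5, 5, 4],
              [6, 2, 1]], dtype=float)

def project_simplex(v):
    # Euclidean projection onto {w : w >= 0, sum(w) = 1} (sort-based).
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u + (1.0 - css) / k > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def variance_of_scores(w):
    return np.var(X @ w)  # population variance of y1..y4

w = np.full(3, 1.0 / 3.0)  # start from uniform weights
n = X.shape[0]
for _ in range(20000):
    y = X @ w
    grad = (2.0 / n) * X.T @ (y - y.mean())  # gradient of np.var(X @ w)
    w = project_simplex(w - 1e-3 * grad)

print(w, variance_of_scores(w))  # roughly [0.49, 0.35, 0.15], variance ~0.18
```

The projection keeps every iterate feasible, so the sum-to-one and nonnegativity constraints hold at every step rather than only at convergence.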

Answer 1 (score: 0)

I think I may have understood the purpose of your question. If you want to find the minimum, I hope this helps. I just modified the values; once you see that this is one way to approach the problem, you can adapt it.

Answer 2 (score: 0)

My complete solution can be viewed in PDF.

The trick is to use the vectors x_i as the columns of a matrix X. The problem then becomes a convex problem, with the solution constrained to the unit simplex.

I solved it using the Projected Sub Gradient Method. I calculated the gradient of the objective function and created a projection onto the Unit Simplex.

Now all that's needed is to iterate between the two. I validated my solution using CVX.

% StackOverflow 44984132
% How to calculate weight to minimize variance?
% Remarks:
%   1.  sa
% TODO:
%   1.  ds
% Release Notes
% - 1.0.000     08/07/2017
%   *   First release.


%% General Parameters

run('InitScript.m');

figureIdx           = 0; %<! Continue from Question 1
figureCounterSpec   = '%04d';

generateFigures = OFF;


%% Simulation Parameters

dimOrder    = 3;
numSamples  = 4;

mX = randi([1, 10], [dimOrder, numSamples]);
vE = ones([dimOrder, 1]);


%% Solve Using CVX

cvx_begin('quiet')
    cvx_precision('best');
    variable vW(numSamples)
    minimize( (0.5 * sum_square_abs( mX * vW - (1 / numSamples) * (vE.' * mX * vW) * vE )) )
    subject to
        sum(vW) == 1;
        vW >= 0;
cvx_end

disp([' ']);
disp(['CVX Solution -                       [ ', num2str(vW.'), ' ]']);


%% Solve Using Projected Sub Gradient

numIterations   = 20000;
stepSize        = 0.001;
simplexRadius   = 1; %<! Unit Simplex Radius
stopThr         = 1e-6;

hKernelFun  = @(vW) ((mX * vW) - ((1 / numSamples) * ((vE.' * mX * vW) * vE)));
hObjFun     = @(vW) 0.5 * sum(hKernelFun(vW) .^ 2);
hGradFun    = @(vW) (mX.' * hKernelFun(vW)) - ((1 / numSamples) * vE.' * (hKernelFun(vW)) * mX.' * vE);

vW = rand([numSamples, 1]);
vW = vW(:) / sum(vW);

for ii = 1:numIterations
    vGradW = hGradFun(vW);
    vW = vW - (stepSize * vGradW);

    % Projecting onto the Unit Simplex
    % sum(vW) == 1, vW >= 0.
    vW = ProjectSimplex(vW, simplexRadius, stopThr);
end

disp([' ']);
disp(['Projected Sub Gradient Solution -    [ ', num2str(vW.'), ' ]']);


%% Restore Defaults

% set(0, 'DefaultFigureWindowStyle', 'normal');
% set(0, 'DefaultAxesLooseInset', defaultLoosInset);

You can see the full code in StackOverflow Q44984132 (a PDF is also available).
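As a side note, the CVX objective above is half the squared norm of a centered score vector, which differs from the variance only by a constant factor. A quick NumPy check of that identity on made-up scores (the data here is arbitrary, only the identity matters):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.random(4)   # stand-in for the scores mX * vW
n = y.size
e = np.ones(n)

centered = y - ((e @ y) / n) * e            # subtract the mean from every entry
half_sq_norm = 0.5 * (centered @ centered)  # the CVX objective's value
print(np.var(y), (2.0 / n) * half_sq_norm)  # the two agree
```

So minimizing the CVX objective and minimizing the (population) variance of the scores pick out the same weights.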

Answer 3 (score: 0)

I don't know much about optimization problems, but I got the idea of gradient descent, so I tried to shrink the gap between the highest and the lowest score by shifting weight between them. My script is below:

# coding: utf-8
import numpy as np
#7.72
#7.6
#8.26

def get_max(alist):
    max_score = max(alist)
    idx = alist.index(max_score)
    return max_score, idx

def get_min(alist):
    max_score = min(alist)
    idx = alist.index(max_score)
    return max_score, idx

def get_weighted(alist,aweight):
    res = []
    for i in range(0, len(alist)):
        res.append(alist[i]*aweight[i])
    return res

def get_sub(list1, list2):
    res = []
    for i in range(0, len(list1)):
        res.append(list1[i] - list2[i])
    return res

def grad_dec(w,dist, st = 0.001):
    max_item, max_item_idx = get_max(dist)
    min_item, min_item_idx = get_min(dist)
    w[max_item_idx] = w[max_item_idx] - st
    w[min_item_idx] = w[min_item_idx] + st

def cal_score(w, x):
    score = []
    print('weight', w, x)
    for i in range(len(x)):
        score_i = 0
        for j in range(len(x[i])):
            score_i = w[j]*x[i][j] + score_i
        score.append(score_i)
    # check whether the variance is small enough
    print('score', score)
    return score

if __name__ == "__main__":
    init_w = [0.2, 0.2, 0.2, 0.2, 0.2]
    x = [[7.3, 10, 8.3, 8.8, 4.2], [6.8, 8.9, 8.4, 9.7, 4.2], [6.9, 9.9, 9.7, 8.1, 6.7]]
    score = cal_score(init_w, x)
    variance = np.var(score)
    for round in range(100):
        if variance < 0.012:
            print('ok')
            break
        max_score, idx = get_max(score)
        min_score, idx2 = get_min(score)
        weighted_1 = get_weighted(x[idx], init_w)
        weighted_2 = get_weighted(x[idx2], init_w)
        dist = get_sub(weighted_1, weighted_2)
        # print(max_score, idx, min_score, idx2, dist)
        grad_dec(init_w, dist)
        score = cal_score(init_w, x)
        variance = np.var(score)
        print('variance', variance)

    print(score)

In my experiments it does reduce the variance. I was happy with that, but I don't know whether my solution is mathematically sound.
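One way to check soundness is to compare a heuristic like the one above against the closed form that the sum-to-one constraint alone admits via a Lagrange multiplier. A sketch on the original question's vectors (the nonnegativity constraints are ignored in the derivation and checked after the fact):

```python
import numpy as np

# The question's vectors as rows; Var(X @ w) = w @ S @ w, where S is the
# population covariance of X's columns.
X = np.array([[3, 4, 6],
              [2, 8, 1],
              [5, 5, 4],
              [6, 2, 1]], dtype=float)
S = np.cov(X, rowvar=False, bias=True)

# Keeping only the sum-to-one constraint, the Lagrange condition gives
# w proportional to S^{-1} 1; normalize so the weights sum to one.
z = np.linalg.solve(S, np.ones(3))
w = z / z.sum()

print(w)              # about [0.493, 0.353, 0.154] -- all positive here,
                      # so the ignored w > 0 constraints hold as well
print(np.var(X @ w))  # about 0.184
```

Because the weights come out positive for this data, this closed form is also the solution of the fully constrained problem, so it serves as a ground truth against which an iterative heuristic's final variance can be compared.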