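I have a database containing 101 simulations of returns for, say, 5 different asset classes.
I need to write a query that computes the correlation between each pair of the 5 classes. The table looks like this:
AssetClass_ID | Simulation | AssetClass_Value
Any ideas? I'm having trouble even getting close.
(Depending on how hard this is, I may end up telling the end users to just download all the simulations and run the statistics with Excel's built-in functions, but I doubt that would be popular.)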
Answer 0 (score: 1)
OK, after some Googling and some work, I came up with:
SELECT
    AssetID_1,
    AssetID_2,
    -- Pearson correlation computed from the running sums built in step1
    ((psum - (sum1 * sum2 / n))
        / sqrt((sum1sq - sum1 * sum1 / n) * (sum2sq - sum2 * sum2 / n))) AS [Correlation Coefficient],
    n
FROM
    (SELECT
        n1.AssetClass_ID AS AssetID_1,
        n2.AssetClass_ID AS AssetID_2,
        SUM(n1.RunResults_Value) AS sum1,
        SUM(n2.RunResults_Value) AS sum2,
        SUM(n1.RunResults_Value * n1.RunResults_Value) AS sum1sq,
        SUM(n2.RunResults_Value * n2.RunResults_Value) AS sum2sq,
        SUM(n1.RunResults_Value * n2.RunResults_Value) AS psum,
        COUNT(*) AS n
    FROM
        dbo.tbl_RunResults AS n1
        -- The WHERE filters on n2 discard unmatched rows, so this behaves as an inner join
        INNER JOIN dbo.tbl_RunResults AS n2 ON n1.Simulation_ID = n2.Simulation_ID
    WHERE
        n1.AssetClass_ID < n2.AssetClass_ID AND  -- each unordered pair once, no self-pairs
        n1.series_ID = 2332 AND
        n2.series_ID = 2332
    GROUP BY
        n1.AssetClass_ID, n2.AssetClass_ID) AS step1
ORDER BY
    AssetID_1
So far the results match Excel's built-in function, which is great.
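If you want to sanity-check the arithmetic outside the database, here is a minimal Python sketch of the same sums-based formula the query uses, compared against the textbook mean-centered definition of the Pearson coefficient. The function names and the sample values are made up for illustration; they are not taken from the actual tables.

```python
import math


def pearson_from_sums(xs, ys):
    """Pearson correlation from the same running sums as the SQL query."""
    n = len(xs)
    sum1 = sum(xs)
    sum2 = sum(ys)
    sum1sq = sum(x * x for x in xs)
    sum2sq = sum(y * y for y in ys)
    psum = sum(x * y for x, y in zip(xs, ys))
    numerator = psum - sum1 * sum2 / n
    denominator = math.sqrt(
        (sum1sq - sum1 * sum1 / n) * (sum2sq - sum2 * sum2 / n)
    )
    return numerator / denominator


def pearson_from_definition(xs, ys):
    """Pearson correlation from mean-centered values, for comparison."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical returns for two asset classes across five simulations.
asset_a = [0.02, -0.01, 0.03, 0.00, 0.015]
asset_b = [0.01, -0.02, 0.025, 0.005, 0.01]

print(pearson_from_sums(asset_a, asset_b))
print(pearson_from_definition(asset_a, asset_b))
```

The two functions should agree to within floating-point rounding. Note that, like the query, the sums-based version divides by zero if either series is constant, so that case may need a guard in practice.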