We have data arriving in the following structure:
entity_id  entity_value  category_id  category_weight  group_id  group_weight
1          100           11           6                101       4
1          100           11           6                102       3
1          100           12           5                102       3
1          100           12           5                103       2
1          100           13           6                101       4
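An entity can belong to any number of categories and any number of groups, in any combination; there is no implied relationship between category membership and group membership.
The data is redundant but consistent: if one row says category 11 has a weight of 6, then every row says category 11 has a weight of 6. The same applies to groups and their weights.
A row is uniquely identified by {entity_id, category_id, group_id}.
The goal is to apportion each entity's value across all of its rows according to the various weights: first by category, then by group.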
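Step 1: apportion by category
Entity 1 is associated with 3 categories {11, 12, 13}, with weights {6, 5, 6}
Assign 100 * (6 / (6 + 5 + 6)) to category 11 => 35.29
Assign 100 * (5 / (6 + 5 + 6)) to category 12 => 29.41
Assign 100 * (6 / (6 + 5 + 6)) to category 13 => 35.29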
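The Step 1 arithmetic can be sketched in Python to make the key point explicit: each category weight must be counted once per entity, even though it repeats on every row for that category. This is only an illustration of the calculation, not the SQL being asked for; `ROWS` and `apportion_by_category` are hypothetical names introduced here.

```python
# Sample rows from the question:
# (entity_id, entity_value, category_id, category_weight, group_id, group_weight)
ROWS = [
    (1, 100, 11, 6, 101, 4),
    (1, 100, 11, 6, 102, 3),
    (1, 100, 12, 5, 102, 3),
    (1, 100, 12, 5, 103, 2),
    (1, 100, 13, 6, 101, 4),
]


def apportion_by_category(rows):
    """Return {(entity_id, category_id): share of the entity's value}."""
    values = {}   # entity_id -> entity_value
    weights = {}  # (entity_id, category_id) -> category_weight, stored once
    for entity_id, entity_value, category_id, category_weight, _, _ in rows:
        values[entity_id] = entity_value
        weights[(entity_id, category_id)] = category_weight

    totals = {}   # entity_id -> sum of its distinct category weights
    for (entity_id, _), weight in weights.items():
        totals[entity_id] = totals.get(entity_id, 0) + weight

    return {
        (entity_id, category_id): values[entity_id] * weight / totals[entity_id]
        for (entity_id, category_id), weight in weights.items()
    }


category_shares = apportion_by_category(ROWS)
for (entity_id, category_id), share in sorted(category_shares.items()):
    print(entity_id, category_id, round(share, 2))
```

Summing `category_weight` directly over the five rows would give 28 rather than 17, which is why the deduplication step matters.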
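Step 2: apportion those results by group
Entity 1, category 11 is associated with groups {101, 102}, with weights {4, 3}
Assign 35.29 * (4 / (4 + 3)) to group 101 => 20.17
Assign 35.29 * (3 / (4 + 3)) to group 102 => 15.12
Entity 1, category 12 is associated with groups {102, 103}, with weights {3, 2}
Assign 29.41 * (3 / (3 + 2)) to group 102 => 17.65
Assign 29.41 * (2 / (3 + 2)) to group 103 => 11.76
Entity 1, category 13 is associated with group {101}, with weight {4}
Assign 35.29 * (4 / (4)) to group 101 => 35.29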
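Both steps together can be sketched in Python as a cross-check on the hand calculation. Again this is only an illustration under the sample data; `ROWS` and `apportion` are hypothetical names. Because it carries unrounded intermediates, the last digit can differ from the figures above (for example, group 102 in category 11 comes out as 15.13 rather than 15.12).

```python
from collections import defaultdict

# Sample rows from the question:
# (entity_id, entity_value, category_id, category_weight, group_id, group_weight)
ROWS = [
    (1, 100, 11, 6, 101, 4),
    (1, 100, 11, 6, 102, 3),
    (1, 100, 12, 5, 102, 3),
    (1, 100, 12, 5, 103, 2),
    (1, 100, 13, 6, 101, 4),
]


def apportion(rows):
    """Return {(entity_id, category_id, group_id): apportioned value}."""
    category_weight = {}            # (entity, category) -> weight, counted once
    group_total = defaultdict(int)  # (entity, category) -> sum of group weights
    for entity, _, category, cat_w, _, grp_w in rows:
        category_weight[(entity, category)] = cat_w
        group_total[(entity, category)] += grp_w

    category_total = defaultdict(int)  # entity -> sum of distinct category weights
    for (entity, _), cat_w in category_weight.items():
        category_total[entity] += cat_w

    return {
        (entity, category, group):
            value
            * cat_w / category_total[entity]            # step 1: by category
            * grp_w / group_total[(entity, category)]   # step 2: by group
        for entity, value, category, cat_w, group, grp_w in rows
    }


apportioned = apportion(ROWS)
for key, value in sorted(apportioned.items()):
    print(key, round(value, 2))
print("total:", round(sum(apportioned.values()), 2))
```

The apportioned values add back up to the entity's original value of 100, which is a useful sanity check for any SQL implementation as well.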
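I can do the second step with window functions. Clean and tidy, with no self-join.
The first step, however, seems to need subqueries and a self-join.
For example: http://sqlfiddle.com/#!18/be890/1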
SELECT
    sample.entity_id,
    sample.category_id,
    sample.group_id,
    sample.entity_value AS original_value,
    sample.entity_value
        * (sample.category_weight / entity.total_category_weight)
        * (sample.group_weight / SUM(sample.group_weight) OVER (PARTITION BY sample.entity_id, sample.category_id))
        AS apportioned_value
FROM
    (
        -- Total category weight per entity, counting each category once
        SELECT
            entity_id,
            SUM(category_weight) AS total_category_weight
        FROM
            (
                -- Collapse to one row per (entity, category)
                SELECT
                    entity_id,
                    category_id,
                    MAX(category_weight) AS category_weight
                FROM
                    sample
                GROUP BY
                    entity_id,
                    category_id
            )
                entity_category
        GROUP BY
            entity_id
    )
        entity
INNER JOIN
    sample
        ON sample.entity_id = entity.entity_id
Is there a tidier way to do this, without the self-join?
Answer 0 (score: 0)
SELECT
    entity_id,
    category_id,
    group_id,
    entity_value AS original_value,
    entity_value
        * (category_weight / SUM(scaled_cat_weight) OVER (PARTITION BY entity_id))
        * (group_weight / SUM(group_weight) OVER (PARTITION BY entity_id, category_id))
        AS apportioned_value
FROM
    (
        SELECT
            *,
            -- Divide each category weight by the number of rows that repeat it,
            -- so summing over the entity counts every category exactly once
            category_weight / COUNT(*) OVER (PARTITION BY entity_id, category_id) AS scaled_cat_weight
        FROM
            sample
    )
        scaled
ORDER BY
    entity_id,
    category_id,
    group_id
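One way to sanity-check this query against the sample data is to run it in an in-memory SQLite database from Python. This is a sketch under stated assumptions rather than a reproduction of the original fiddle: it assumes SQLite 3.25 or later (for window functions), and it declares the value and weight columns as REAL, since the question does not state the column types. With integer columns the divisions would truncate, in SQLite and SQL Server alike, so a cast would be needed there. `ROWS`, `QUERY` and `results` are names introduced for this illustration.

```python
import sqlite3

# Sample rows from the question:
# (entity_id, entity_value, category_id, category_weight, group_id, group_weight)
ROWS = [
    (1, 100, 11, 6, 101, 4),
    (1, 100, 11, 6, 102, 3),
    (1, 100, 12, 5, 102, 3),
    (1, 100, 12, 5, 103, 2),
    (1, 100, 13, 6, 101, 4),
]

# The query from the answer, minus the original_value column.
QUERY = """
SELECT
    entity_id,
    category_id,
    group_id,
    entity_value
        * (category_weight / SUM(scaled_cat_weight) OVER (PARTITION BY entity_id))
        * (group_weight / SUM(group_weight) OVER (PARTITION BY entity_id, category_id))
        AS apportioned_value
FROM
    (
        SELECT
            *,
            category_weight / COUNT(*) OVER (PARTITION BY entity_id, category_id)
                AS scaled_cat_weight
        FROM
            sample
    )
        scaled
ORDER BY
    entity_id,
    category_id,
    group_id
"""

conn = sqlite3.connect(":memory:")
# Assumption: REAL columns, so that "/" is floating-point division.
conn.execute(
    """
    CREATE TABLE sample (
        entity_id INTEGER, entity_value REAL,
        category_id INTEGER, category_weight REAL,
        group_id INTEGER, group_weight REAL
    )
    """
)
conn.executemany("INSERT INTO sample VALUES (?, ?, ?, ?, ?, ?)", ROWS)

results = conn.execute(QUERY).fetchall()
for entity_id, category_id, group_id, value in results:
    print(entity_id, category_id, group_id, round(value, 2))
print("total:", round(sum(row[3] for row in results), 2))
conn.close()
```

For entity 1 the scaled weights are 3, 3, 2.5, 2.5 and 6, which sum to 17, the same total the question's subquery derives with MAX and GROUP BY.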