Question

We have data arriving in the following structure:

entity_id   entity_value   category_id   category_weight   group_id   group_weight
    1            100            11               6            101          4
    1            100            11               6            102          3
    1            100            12               5            102          3
    1            100            12               5            103          2
    1            100            13               6            101          4

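An entity can belong to any categories and any groups, in any combination; there is no implied relationship between category membership and group membership.

The data is redundant but consistent: if one row says category 11 has a weight of 6, then every row says category 11 has a weight of 6. The same applies to groups and their weights.

A row of data is uniquely identified by {entity_id, category_id, group_id}.

The goal is to apportion the entity's value across all rows according to the various weights: first by category, then by group.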

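Step 1: Apportion by category

Entity 1 is associated with 3 categories {11, 12, 13}, with weights {6, 5, 6}

    Allocate 100 * (6 / (6 + 5 + 6)) to category 11 => 35.29
    Allocate 100 * (5 / (6 + 5 + 6)) to category 12 => 29.41
    Allocate 100 * (6 / (6 + 5 + 6)) to category 13 => 35.29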
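To make the arithmetic of step 1 concrete, here is a minimal Python sketch of it (not SQL, and not the query being asked for). The function name `apportion_by_category` and the tuple layout are assumptions made purely for illustration; the rows are the sample data from the table above.

```python
# Sample rows: (entity_id, entity_value, category_id, category_weight, group_id, group_weight)
rows = [
    (1, 100, 11, 6, 101, 4),
    (1, 100, 11, 6, 102, 3),
    (1, 100, 12, 5, 102, 3),
    (1, 100, 12, 5, 103, 2),
    (1, 100, 13, 6, 101, 4),
]

def apportion_by_category(rows):
    """Return {(entity_id, category_id): share of the entity's value}."""
    weights = {}  # (entity_id, category_id) -> category_weight, one entry per category
    values = {}   # entity_id -> entity_value
    for entity_id, entity_value, category_id, category_weight, _, _ in rows:
        # The data is redundant but consistent, so overwriting is safe.
        weights[(entity_id, category_id)] = category_weight
        values[entity_id] = entity_value

    # Total category weight per entity, counting each category exactly once.
    totals = {}
    for (entity_id, _), weight in weights.items():
        totals[entity_id] = totals.get(entity_id, 0) + weight

    return {
        (entity_id, category_id): values[entity_id] * weight / totals[entity_id]
        for (entity_id, category_id), weight in weights.items()
    }

category_shares = apportion_by_category(rows)
for (entity_id, category_id), share in sorted(category_shares.items()):
    print(entity_id, category_id, round(share, 2))
```

The key point is the deduplication: category 11 appears on two rows, but its weight of 6 must only be counted once in the denominator of 17.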

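Step 2: Apportion those results by group

Entity 1, category 11 is associated with groups {101, 102}, with weights {4, 3}

    Allocate 35.29 * (4 / (4 + 3)) to group 101 => 20.17
    Allocate 35.29 * (3 / (4 + 3)) to group 102 => 15.12

Entity 1, category 12 is associated with groups {102, 103}, with weights {3, 2}

    Allocate 29.41 * (3 / (3 + 2)) to group 102 => 17.65
    Allocate 29.41 * (2 / (3 + 2)) to group 103 => 11.76

Entity 1, category 13 is associated with group {101}, with weight {4}

    Allocate 35.29 * (4 / (4)) to group 101 => 35.29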
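Both steps together can be sketched in plain Python as a reference for what the SQL should return. This is only an illustration under assumed names (`apportion` and the tuple layout are made up here). Note that it carries full precision between the steps, so a value can differ in the last decimal place from the hand-worked figures, which start from the rounded 35.29 and 29.41.

```python
from collections import defaultdict

# Sample rows: (entity_id, entity_value, category_id, category_weight, group_id, group_weight)
rows = [
    (1, 100, 11, 6, 101, 4),
    (1, 100, 11, 6, 102, 3),
    (1, 100, 12, 5, 102, 3),
    (1, 100, 12, 5, 103, 2),
    (1, 100, 13, 6, 101, 4),
]

def apportion(rows):
    """Return {(entity_id, category_id, group_id): apportioned value}."""
    category_weight = {}            # (entity, category) -> weight, counted once
    group_total = defaultdict(int)  # (entity, category) -> sum of group weights
    for entity, _, category, cat_w, _, grp_w in rows:
        category_weight[(entity, category)] = cat_w
        group_total[(entity, category)] += grp_w

    category_total = defaultdict(int)  # entity -> sum of distinct category weights
    for (entity, _), cat_w in category_weight.items():
        category_total[entity] += cat_w

    result = {}
    for entity, value, category, cat_w, group, grp_w in rows:
        by_category = value * cat_w / category_total[entity]               # step 1
        by_group = by_category * grp_w / group_total[(entity, category)]  # step 2
        result[(entity, category, group)] = by_group
    return result

apportioned = apportion(rows)
for key, value in sorted(apportioned.items()):
    print(key, round(value, 2))
```

Whatever the implementation, the apportioned values for an entity should add back up to its original value (100 here), which makes a handy sanity check for any candidate query.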

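I can do the second step with window functions. Clean and tidy, no self-join.

The first step, however, seems to need a sub-query and a self-join.

For example... http://sqlfiddle.com/#!18/be890/1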

SELECT
  sample.entity_id,
  sample.category_id,
  sample.group_id,
  sample.entity_value   AS original_value,
  sample.entity_value
  -- Step 1: this row's category weight as a share of the entity's total
  * (sample.category_weight / entity.total_category_weight)
  -- Step 2: this row's group weight as a share within its entity/category
  * (sample.group_weight    / SUM(sample.group_weight) OVER (PARTITION BY sample.entity_id, sample.category_id))
    AS apportioned_value
FROM
(
  -- Total category weight per entity, counting each category once
  SELECT
    entity_id,
    SUM(category_weight)   AS total_category_weight
  FROM
  (
    -- Collapse the redundant rows to one row per entity/category
    SELECT
      entity_id,
      category_id,
      MAX(category_weight)   AS category_weight
    FROM
      sample
    GROUP BY
      entity_id,
      category_id
  )
    entity_category
  GROUP BY
    entity_id
)
  entity
INNER JOIN
  sample
    ON sample.entity_id = entity.entity_id

Is there a tidier way, without the self-join?

Answer 1

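SELECT
  entity_id,
  category_id,
  group_id,
  entity_value   AS original_value,
  entity_value
  -- Summing the scaled weights over the entity counts each category once
  * (category_weight / SUM(scaled_cat_weight) OVER (PARTITION BY entity_id             ))
  * (group_weight    / SUM(group_weight     ) OVER (PARTITION BY entity_id, category_id))
    AS apportioned_value
FROM
(
  SELECT
    *,
    -- Spread each category's weight evenly over the rows that repeat it.
    -- Assumes the weight columns are a non-integer numeric type; with INT
    -- columns SQL Server would perform integer division here.
    category_weight / COUNT(*) OVER (PARTITION BY entity_id, category_id)   AS scaled_cat_weight
  FROM
    sample
)
  scaled
ORDER BY
  entity_id,
  category_id,
  group_id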
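The query relies on one observation: if a category's weight is divided by the number of rows that repeat it, then summing those scaled weights over every row of the entity gives the same total as summing each distinct category weight once. A small Python check of that identity on the sample data follows; it is only a sketch, the variable names are made up for illustration, and `Fraction` is used so the comparison is exact.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Sample rows: (entity_id, entity_value, category_id, category_weight, group_id, group_weight)
rows = [
    (1, 100, 11, 6, 101, 4),
    (1, 100, 11, 6, 102, 3),
    (1, 100, 12, 5, 102, 3),
    (1, 100, 12, 5, 103, 2),
    (1, 100, 13, 6, 101, 4),
]

# Equivalent of COUNT(*) OVER (PARTITION BY entity_id, category_id)
rows_per_category = Counter((entity, category) for entity, _, category, _, _, _ in rows)

# Equivalent of SUM(scaled_cat_weight) OVER (PARTITION BY entity_id)
scaled_total = defaultdict(Fraction)
for entity, _, category, cat_w, _, _ in rows:
    scaled_total[entity] += Fraction(cat_w, rows_per_category[(entity, category)])

# What the self-join version computes: each category's weight counted once
distinct_weights = {(entity, category): cat_w for entity, _, category, cat_w, _, _ in rows}
distinct_total = defaultdict(Fraction)
for (entity, _), cat_w in distinct_weights.items():
    distinct_total[entity] += cat_w

print(scaled_total[1], distinct_total[1])
```

For entity 1 the scaled weights are 6/2 + 6/2 + 5/2 + 5/2 + 6/1, which is 17, the same as 6 + 5 + 6. The halves are also why the division must not be truncated to an integer.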

Academic puzzle: deriving proportions without a self-join