Calculating the sum of absolute differences between rows, by group

Time: 2013-05-13 03:17:56

Tags: mysql query-help

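I have a problem that I have been able to solve in Stata, but my data has now grown to a size that I can no longer process in memory, so I would like to do this in MySQL instead. I am trying to calculate the Manhattan distance between n groups, based on the items each group contains. So far I have manipulated the data into a shape that I hope is ready for the calculation: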

SELECT * FROM exampleshares;

+----------+-------------+-------------+
| item     | group       | share       |
+----------+-------------+-------------+
| A        | group1      |  .3         |
| B        | group1      |  .7         |
| A        | group2      |  .2         |
| B        | group2      |  .6         |
| C        | group2      |  .2         |
| A        | group3      |  .3         |
| C        | group3      |  .6         |
+----------+-------------+-------------+

The Manhattan distances for this example would be:

+----------+-------------+-------------+
| groupX   | groupY      | M distance  |
+----------+-------------+-------------+
| group1   | group1      | 0           |
| group1   | group2      |  .4         |
| group1   | group3      | 1.3         |
| group2   | group1      |  .4         |
| group2   | group2      | 0           |
| group2   | group3      | 1.1         |
| group3   | group1      | 1.3         |
| group3   | group2      | 1.1         |
| group3   | group3      | 0           |
+----------+-------------+-------------+

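For example, the distance between group1 and group2 is calculated as |.3-.2| + |.7-.6| + |0-.2| = 0.4, i.e. the sum of the absolute differences between the shares, where an item missing from a group counts as a share of 0. How can I do this in MySQL?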

While searching, I found several solutions for calculating the difference from the previous row by group, but nothing that does specifically what I want.
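To pin down the definition used above, here is a minimal Python sketch of the calculation on the sample rows. It is only a small in-memory reference to check results against, not a solution for data that does not fit in memory; the function name `manhattan_distances` and the `rows` list are made up for illustration.

```python
from itertools import product

# Sample rows copied from the exampleshares table above: (item, group, share).
rows = [
    ("A", "group1", 0.3),
    ("B", "group1", 0.7),
    ("A", "group2", 0.2),
    ("B", "group2", 0.6),
    ("C", "group2", 0.2),
    ("A", "group3", 0.3),
    ("C", "group3", 0.6),
]


def manhattan_distances(rows):
    """Return {(groupX, groupY): distance} for every ordered pair of groups."""
    shares = {}  # group -> {item: share}
    for item, group, share in rows:
        shares.setdefault(group, {})[item] = share
    items = {item for item, _, _ in rows}

    result = {}
    for gx, gy in product(sorted(shares), repeat=2):
        # An item that is missing from a group counts as a share of 0.
        total = sum(
            abs(shares[gx].get(i, 0.0) - shares[gy].get(i, 0.0)) for i in items
        )
        # Round away floating-point noise from the decimal shares.
        result[(gx, gy)] = round(total, 10)
    return result


distances = manhattan_distances(rows)
for (gx, gy), d in sorted(distances.items()):
    print(gx, gy, d)
```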

1 answer:

Answer 0: (score: 0)

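I believe you will have to use a stored routine or some other script to accomplish this. Here is a stored routine that returns the distance between two given groups: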

delimiter //
drop procedure if exists manhattanDistance//
-- decimal(2,1) fits the one-decimal sample shares; widen it for finer-grained data.
create procedure manhattanDistance (in startGroup char(32), in endGroup char(32), out manhattanDistance decimal(2,1))
    not deterministic
    modifies sql data
begin
  -- Collect every distinct item so that items missing from one group still count.
  drop temporary table if exists tmp_items;
  create temporary table tmp_items as select distinct item from exampleshares;

  -- A missing (item, group) row is treated as a share of 0.
  select sum(abs(ifnull(es1.share, 0) - ifnull(es2.share, 0))) into manhattanDistance
    from tmp_items ti
    left join exampleshares es1 on es1.item = ti.item and es1.group = startGroup
    left join exampleshares es2 on es2.item = ti.item and es2.group = endGroup;
end//
delimiter ;

call manhattanDistance('group1', 'group2', @distanceBetweenGroup1And2);
select @distanceBetweenGroup1And2;
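The routine above handles one pair of groups per call. As a possible alternative, the whole distance table might be produced by a single set-based query that cross joins the distinct groups with the distinct items. The sketch below is hedged: it runs the query through Python's built-in `sqlite3` module against the sample data only, and it assumes (without having been run on MySQL) that the same statement carries over, since it uses only `COALESCE`, `ABS`, `SUM`, joins, and backtick-quoted `group`. The name `PAIRWISE_SQL` is made up for illustration.

```python
import sqlite3

# In-memory stand-in for the MySQL table; `group` is quoted because it is a keyword.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exampleshares (item TEXT, `group` TEXT, share REAL)")
conn.executemany(
    "INSERT INTO exampleshares VALUES (?, ?, ?)",
    [
        ("A", "group1", 0.3),
        ("B", "group1", 0.7),
        ("A", "group2", 0.2),
        ("B", "group2", 0.6),
        ("C", "group2", 0.2),
        ("A", "group3", 0.3),
        ("C", "group3", 0.6),
    ],
)

# Every (groupX, groupY, item) combination, with missing shares treated as 0.
PAIRWISE_SQL = """
SELECT g1.`group` AS groupX,
       g2.`group` AS groupY,
       SUM(ABS(COALESCE(s1.share, 0) - COALESCE(s2.share, 0))) AS distance
FROM (SELECT DISTINCT `group` FROM exampleshares) AS g1
CROSS JOIN (SELECT DISTINCT `group` FROM exampleshares) AS g2
CROSS JOIN (SELECT DISTINCT item FROM exampleshares) AS i
LEFT JOIN exampleshares AS s1 ON s1.`group` = g1.`group` AND s1.item = i.item
LEFT JOIN exampleshares AS s2 ON s2.`group` = g2.`group` AND s2.item = i.item
GROUP BY g1.`group`, g2.`group`
ORDER BY g1.`group`, g2.`group`
"""

# Round away floating-point noise; SQLite stores the shares as REAL.
pairwise = {(gx, gy): round(d, 6) for gx, gy, d in conn.execute(PAIRWISE_SQL)}
for (gx, gy), d in pairwise.items():
    print(gx, gy, d)
```

Note that the intermediate result has one row per group pair per item, so on a large table this may be expensive; an index on (`group`, item) would likely help, and restricting the pairs to groupX < groupY would roughly halve the work since the distance is symmetric.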