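I have a problem that I have been able to solve in Stata, but my data has grown to the point where I can no longer process it in memory, so I would like to do this in MySQL instead. I am trying to calculate the Manhattan distance between n groups, based on the items each group contains. So far I have manipulated the data into a shape that I hope is ready for the calculation: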
SELECT * FROM exampleshares;
+----------+-------------+-------------+
| item | group | share |
+----------+-------------+-------------+
| A | group1 | .3 |
| B | group1 | .7 |
| A | group2 | .2 |
| B | group2 | .6 |
| C | group2 | .2 |
| A | group3 | .3 |
| C | group3 | .6 |
+----------+-------------+-------------+
The Manhattan distances for this example would be:
+----------+-------------+-------------+
| groupX | groupY | M distance |
+----------+-------------+-------------+
| group1 | group1 | 0 |
| group1 | group2 | .4 |
| group1 | group3 | 1.3 |
| group2 | group1 | .4 |
| group2 | group2 | 0 |
| group2 | group3 | 1.1 |
| group3 | group1 | 1.3 |
| group3 | group2 | 1.1 |
| group3 | group3 | 0 |
+----------+-------------+-------------+
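For example, the distance between group1 and group2 is calculated as |.3-.2| + |.7-.6| + |0-.2| = 0.4, i.e. the sum of the absolute differences in shares, where an item that is missing from a group counts as a share of 0. How can I do this in MySQL?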
While searching, I found a few solutions for calculating the difference from the previous row by group, but nothing that does specifically what I am after.
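To make the arithmetic above concrete, here is a minimal reference sketch in Python that recomputes the expected distance table from the sample rows. It is only meant as a way to check results on small data, not as a replacement for the MySQL solution; the names `ROWS` and `manhattan_distances` are made up for this illustration, and shares are written as decimal strings to avoid floating-point noise.

```python
from decimal import Decimal
from itertools import product

# Sample rows copied from the exampleshares table: (item, group, share).
ROWS = [
    ("A", "group1", "0.3"),
    ("B", "group1", "0.7"),
    ("A", "group2", "0.2"),
    ("B", "group2", "0.6"),
    ("C", "group2", "0.2"),
    ("A", "group3", "0.3"),
    ("C", "group3", "0.6"),
]


def manhattan_distances(rows):
    """Return {(group_x, group_y): distance} for every ordered pair of groups."""
    shares = {}  # group -> {item: share}
    for item, group, share in rows:
        shares.setdefault(group, {})[item] = Decimal(share)
    items = {item for item, _, _ in rows}
    zero = Decimal("0")
    result = {}
    for gx, gy in product(sorted(shares), repeat=2):
        # An item missing from a group counts as a share of 0.
        result[(gx, gy)] = sum(
            (abs(shares[gx].get(i, zero) - shares[gy].get(i, zero)) for i in items),
            zero,
        )
    return result


distances = manhattan_distances(ROWS)
for (gx, gy), d in sorted(distances.items()):
    print(gx, gy, d)
```

The loop over `product(..., repeat=2)` produces all nine ordered pairs, including each group paired with itself, which mirrors the layout of the expected table.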
Answer 0 (score: 0)
I believe you will have to use a stored routine or some other script to accomplish this. Here is a stored routine:
delimiter //
drop procedure if exists manhattanDistance//
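create procedure manhattanDistance (in startGroup char(32), in endGroup char(32), out manhattanDistance decimal(2,1))
not deterministic
reads sql data
begin
    -- Note: decimal(2,1) keeps a single decimal place, which fits the sample data;
    -- widen the out parameter if your shares are more precise.
    -- Build the full list of items so that an item missing from either group still contributes.
    drop temporary table if exists tmp_items;
    create temporary table tmp_items as select distinct item from exampleshares;
    -- A missing share counts as 0; sum the absolute differences over every item.
    select sum(abs(ifnull(es1.share, 0) - ifnull(es2.share, 0))) into manhattanDistance
    from tmp_items ti
    left join exampleshares es1 on es1.item = ti.item and es1.group = startGroup
    left join exampleshares es2 on es2.item = ti.item and es2.group = endGroup;
end//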
delimiter ;
call manhattanDistance('group1', 'group2', @distanceBetweenGroup1And2);
select @distanceBetweenGroup1And2;
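The join logic inside the routine can be tried out on the sample data without a MySQL server. The sketch below is an assumption-laden stand-in: it uses Python's standard `sqlite3` module with an in-memory SQLite database rather than MySQL, replaces the temporary table with an inline `select distinct` subquery, and binds the two group names as query parameters. The helper name `distance` is hypothetical. It then calls the query once per ordered pair of groups, which is how the routine would be used to fill the whole distance table.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE exampleshares (item TEXT, "group" TEXT, share REAL)')
conn.executemany(
    "INSERT INTO exampleshares VALUES (?, ?, ?)",
    [
        ("A", "group1", 0.3), ("B", "group1", 0.7),
        ("A", "group2", 0.2), ("B", "group2", 0.6), ("C", "group2", 0.2),
        ("A", "group3", 0.3), ("C", "group3", 0.6),
    ],
)

# Same left-join pattern as the stored routine: start from every distinct item,
# then look up its share in each of the two groups (NULL when absent).
PAIR_QUERY = """
SELECT SUM(ABS(IFNULL(es1.share, 0) - IFNULL(es2.share, 0)))
FROM (SELECT DISTINCT item FROM exampleshares) ti
LEFT JOIN exampleshares es1 ON es1.item = ti.item AND es1."group" = ?
LEFT JOIN exampleshares es2 ON es2.item = ti.item AND es2."group" = ?
"""


def distance(start_group, end_group):
    """Manhattan distance between two groups, rounded to one decimal place."""
    (value,) = conn.execute(PAIR_QUERY, (start_group, end_group)).fetchone()
    return round(value, 1)


groups = [row[0] for row in conn.execute(
    'SELECT DISTINCT "group" FROM exampleshares ORDER BY 1')]
table = {(gx, gy): distance(gx, gy) for gx in groups for gy in groups}
for (gx, gy), d in table.items():
    print(gx, gy, d)
```

Rounding to one decimal place mirrors the `decimal(2,1)` out parameter of the routine; because SQLite stores the shares as floating-point values, the unrounded sums may differ from the exact figures in the last digits.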