Number of matches between two rows in MySQL

Time: 2015-02-07 08:39:43

Tags: mysql numbers

So, here is the challenge:

I have two tables:

etalon:

+-----+-----+-----+-----+----+
|  e1 |  e2 |  e3 |  e4 | e5 |
+-----+-----+-----+-----+----+
|  01 |  02 |  03 |  04 | 05 |
+-----+-----+-----+-----+----+

candidates:

+-----+----+-----+-----+-----+----+----+
| ID  | c1 | c2  | c3  | c4  | c5 | nn |
+-----+----+-----+-----+-----+----+----+
| 00  | 03 | 08  | 02  | 01  | 06 | ** |
+-----+----+-----+-----+-----+----+----+
| 01  | 05 | 04  | 03  | 02  | 01 | ** |
+-----+----+-----+-----+-----+----+----+
| 02  | 06 | 07  | 08  | 09  | 10 | ** |
+-----+----+-----+-----+-----+----+----+
| 03  | 08 | 06  | 09  | 02  | 07 | ** |
+-----+----+-----+-----+-----+----+----+

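What query should I use to find the number of matches between the two rows (e1, e2, e3, e4, e5 and c1, c2, c3, c4, c5) and store it in the nn column for every row of the candidates table?

The result should look like this:

candidates: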

|-----|----|-----|-----|-----|-----|----|
| ID  | c1 | c2  | c3  | c4  | c5  | nn |
|-----|----|-----|-----|-----|-----|----|
| 00  | 03 | 08  | 02  | 01  | 06  | 03 |
|-----|----|-----|-----|-----|-----|----|
| 01  | 05 | 04  | 03  | 02  | 01  | 05 |
|-----|----|-----|-----|-----|-----|----|
| 02  | 06 | 07  | 08  | 09  | 10  | 00 |
|-----|----|-----|-----|-----|-----|----|
| 03  | 08 | 06  | 09  | 02  | 07  | 01 |
|-----|----|-----|-----|-----|-----|----|

The possible values of nn are:

0 - no matches
1,2,3,4,5 - numbers of matches 

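How can I do this?

1 answer:

Answer 0: (score: 0)

The goal is to establish the maximal partial match between the row of the master table (etalon) and each row of the candidates table, regardless of which columns the matching values sit in.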

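The idea is to abstract away from the column ids by representing the column contents in a different way. Since the value domain is {1, ..., 10}, as you have shown, one can pick the first 10 primes {p_1, ..., p_10} = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 } and map i to p_i. The comparison is then based on the products of the mapped column values. This approach exploits the uniqueness of prime factorization, i.e. every integer greater than 1 factors into exactly one multiset of primes, so a prime divides a row's product precisely when the corresponding value occurs in that row.

A standalone, one-shot SQL UPDATE statement is rather cumbersome to write down, so we create a temporary table that holds the products of the mapped values: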
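As a worked check of this idea on the question's data (a sketch only; P_c and P_e are labels introduced here for the two products, not names used in the SQL), take candidate row 00 with values 03, 08, 02, 01, 06 and the etalon row with values 01 to 05:

```latex
\begin{align*}
P_c &= p_3 \cdot p_8 \cdot p_2 \cdot p_1 \cdot p_6 = 5 \cdot 19 \cdot 3 \cdot 2 \cdot 13 = 7410 \\
P_e &= p_1 \cdot p_2 \cdot p_3 \cdot p_4 \cdot p_5 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 = 2310 \\
\gcd(P_c, P_e) &= 2 \cdot 3 \cdot 5 = 30
\end{align*}
```

Three primes (2, 3 and 5) divide both products, which corresponds to the shared values 01, 02 and 03 and to the expected nn = 3 for that row.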

CREATE TEMPORARY TABLE t_pp (
      id            INT      -- MySQL has no NUMBER type; INT is wide enough for these products
    , mp_candidates INT
    , mp_etalon     INT
    , nn            INT
);
INSERT INTO t_pp ( id, mp_candidates, mp_etalon, nn )
     SELECT id
          ,   CASE c1
                  WHEN  1 THEN  2
                  WHEN  2 THEN  3
                  WHEN  3 THEN  5
                  WHEN  4 THEN  7
                  WHEN  5 THEN 11
                  WHEN  6 THEN 13
                  WHEN  7 THEN 17
                  WHEN  8 THEN 19
                  WHEN  9 THEN 23
                  WHEN 10 THEN 29
                  ELSE         31
              END
            * CASE c2 WHEN  1 THEN  2 WHEN  2 THEN  3 WHEN  3 THEN  5 WHEN  4 THEN  7 WHEN  5 THEN 11 WHEN  6 THEN 13 WHEN  7 THEN 17 WHEN  8 THEN 19 WHEN  9 THEN 23 WHEN 10 THEN 29 ELSE 31 END
            * CASE c3 WHEN  1 THEN  2 WHEN  2 THEN  3 WHEN  3 THEN  5 WHEN  4 THEN  7 WHEN  5 THEN 11 WHEN  6 THEN 13 WHEN  7 THEN 17 WHEN  8 THEN 19 WHEN  9 THEN 23 WHEN 10 THEN 29 ELSE 31 END
            * CASE c4 WHEN  1 THEN  2 WHEN  2 THEN  3 WHEN  3 THEN  5 WHEN  4 THEN  7 WHEN  5 THEN 11 WHEN  6 THEN 13 WHEN  7 THEN 17 WHEN  8 THEN 19 WHEN  9 THEN 23 WHEN 10 THEN 29 ELSE 31 END
            * CASE c5 WHEN  1 THEN  2 WHEN  2 THEN  3 WHEN  3 THEN  5 WHEN  4 THEN  7 WHEN  5 THEN 11 WHEN  6 THEN 13 WHEN  7 THEN 17 WHEN  8 THEN 19 WHEN  9 THEN 23 WHEN 10 THEN 29 ELSE 31 END
                mp_candidates

          ,   CASE e1
                  WHEN  1 THEN  2
                  WHEN  2 THEN  3
                  WHEN  3 THEN  5
                  WHEN  4 THEN  7
                  WHEN  5 THEN 11
                  WHEN  6 THEN 13
                  WHEN  7 THEN 17
                  WHEN  8 THEN 19
                  WHEN  9 THEN 23
                  WHEN 10 THEN 29
                  ELSE         31
              END
            * CASE e2 WHEN  1 THEN  2 WHEN  2 THEN  3 WHEN  3 THEN  5 WHEN  4 THEN  7 WHEN  5 THEN 11 WHEN  6 THEN 13 WHEN  7 THEN 17 WHEN  8 THEN 19 WHEN  9 THEN 23 WHEN 10 THEN 29 ELSE 31 END
            * CASE e3 WHEN  1 THEN  2 WHEN  2 THEN  3 WHEN  3 THEN  5 WHEN  4 THEN  7 WHEN  5 THEN 11 WHEN  6 THEN 13 WHEN  7 THEN 17 WHEN  8 THEN 19 WHEN  9 THEN 23 WHEN 10 THEN 29 ELSE 31 END
            * CASE e4 WHEN  1 THEN  2 WHEN  2 THEN  3 WHEN  3 THEN  5 WHEN  4 THEN  7 WHEN  5 THEN 11 WHEN  6 THEN 13 WHEN  7 THEN 17 WHEN  8 THEN 19 WHEN  9 THEN 23 WHEN 10 THEN 29 ELSE 31 END
            * CASE e5 WHEN  1 THEN  2 WHEN  2 THEN  3 WHEN  3 THEN  5 WHEN  4 THEN  7 WHEN  5 THEN 11 WHEN  6 THEN 13 WHEN  7 THEN 17 WHEN  8 THEN 19 WHEN  9 THEN 23 WHEN 10 THEN 29 ELSE 31 END
                mp_etalon
          , 0   nn
       FROM candidates
 CROSS JOIN etalon     
          ;

Now pass #2, counting the matches:

UPDATE t_pp
   SET nn =
             CASE WHEN mp_candidates MOD  2 = 0 AND mp_etalon MOD  2 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD  3 = 0 AND mp_etalon MOD  3 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD  5 = 0 AND mp_etalon MOD  5 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD  7 = 0 AND mp_etalon MOD  7 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD 11 = 0 AND mp_etalon MOD 11 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD 13 = 0 AND mp_etalon MOD 13 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD 17 = 0 AND mp_etalon MOD 17 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD 19 = 0 AND mp_etalon MOD 19 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD 23 = 0 AND mp_etalon MOD 23 = 0  THEN 1 ELSE 0 END
           + CASE WHEN mp_candidates MOD 29 = 0 AND mp_etalon MOD 29 = 0  THEN 1 ELSE 0 END
     ;
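To see the counting step in isolation, here is a minimal sketch that applies the same divisibility tests to two literal products. The literals are assumptions chosen for illustration: 7410 is the product for the values 03, 08, 02, 01, 06 and 2310 is the product for the values 01 to 05. It relies on MySQL evaluating a boolean expression to 1 or 0, which is a more compact spelling of the CASE expressions above.

```sql
-- Illustrative literals, not read from any table:
--   7410 = 5 * 19 * 3 * 2 * 13   (values 03, 08, 02, 01, 06)
--   2310 = 2 * 3 * 5 * 7 * 11    (values 01 to 05)
-- Each parenthesised test is 1 when the prime divides both products, else 0.
SELECT   (7410 MOD  2 = 0 AND 2310 MOD  2 = 0)
       + (7410 MOD  3 = 0 AND 2310 MOD  3 = 0)
       + (7410 MOD  5 = 0 AND 2310 MOD  5 = 0)
       + (7410 MOD  7 = 0 AND 2310 MOD  7 = 0)
       + (7410 MOD 11 = 0 AND 2310 MOD 11 = 0)
       + (7410 MOD 13 = 0 AND 2310 MOD 13 = 0)
       + (7410 MOD 17 = 0 AND 2310 MOD 17 = 0)
       + (7410 MOD 19 = 0 AND 2310 MOD 19 = 0)
       + (7410 MOD 23 = 0 AND 2310 MOD 23 = 0)
       + (7410 MOD 29 = 0 AND 2310 MOD 29 = 0)
       AS nn;   -- only 2, 3 and 5 divide both numbers, so nn is 3
```

The primes 13 and 19 divide only the first product, and 7 and 11 only the second, so they do not contribute to the count.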

Finally, transfer the results to the original table and clean up:

UPDATE candidates c
   set nn = ( SELECT p.nn FROM t_pp p WHERE p.id = c.id )
     ;
DROP TEMPORARY TABLE t_pp;

Further notes:

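  • The scheme as shown assumes that the cell values are unique within each row. However, it can easily be extended to allow multiple occurrences of a value.
  • In principle, this could be wrapped in a single SQL statement - for obvious reasons this is not recommended.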
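  • RDBMSs other than MySQL follow the SQL standard and provide the WITH clause, so they do not need the temporary table (MySQL itself has supported WITH since version 8.0).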
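  • The value 31 in the ELSE branches of the CASE expressions above is a dummy value; it is never tested in pass #2, so values outside {1, ..., 10} never count as matches.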
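For comparison with the single-statement remark in the notes above, here is a minimal sketch of a different, simpler technique: counting membership directly with IN instead of using prime products. This is not the scheme described in this answer. It assumes that etalon holds exactly one row, that values are unique within each row (the same assumption as above), and that none of the columns is NULL.

```sql
-- Alternative technique (not the prime-product scheme): direct membership count.
-- Assumptions: etalon has exactly one row, values are unique within each row,
-- and no column is NULL.
-- In MySQL, "x IN (...)" evaluates to 1 or 0, so the five tests can be summed.
UPDATE candidates c
 CROSS JOIN etalon e
   SET c.nn = (c.c1 IN (e.e1, e.e2, e.e3, e.e4, e.e5))
            + (c.c2 IN (e.e1, e.e2, e.e3, e.e4, e.e5))
            + (c.c3 IN (e.e1, e.e2, e.e3, e.e4, e.e5))
            + (c.c4 IN (e.e1, e.e2, e.e3, e.e4, e.e5))
            + (c.c5 IN (e.e1, e.e2, e.e3, e.e4, e.e5));
```

If a value could occur more than once within a candidate row, this form would count each occurrence separately, so it would need adjusting for that case.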