SQL query on a large table of sizes

Time: 2013-04-04 21:24:12

Tags: sql

I need help finding similar values in a SQL database. The table structure looks like this:

    id         |        item_id_nm |      height |    width |     length |     weight
    ----------------------------------------------------------------------------------
    1          |       00000000001 |      1.0    |     1.0  |        1.0 |         1.0
    2          |       00000000001 |      1.1    |     1.0  |        0.9 |         1.1
    3          |       00000000001 |      2.0    |     1.0  |        1.0 |         1.0
    4          |       00000000002 |      1.0    |     1.0  |        1.0 |         1.0
    5          |       00000000002 |      1.0    |     1.1  |        1.1 |         1.0
    6          |       00000000002 |      1.0    |     1.0  |        1.0 |         2.0

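id obviously cannot have duplicates; item_id_nm can have duplicates (in fact it can occur many times, i.e. more than 2).

How would I write the SQL to find the duplicates of item_id_nm, but only when the values of height OR width OR length OR weight differ by 30% or more?

I know it needs to loop through the table, but how do I do the check? Thanks for your help.

Edit: added an example of the 30% difference. id = 3 has a height that is about 200% of the 1.0 (or 1.1) height of ids 1 and 2. Sorry for not being clear, but any of the height, width, length, or weight values may differ by 30%. If any one of them differs by 30%, the row counts as a duplicate of the others.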
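
To make the rule in the edit concrete, here is a minimal sketch that applies it to the three heights of item 00000000001. Measuring the difference relative to the smaller of the two values is an assumption, since the question does not say which value is the reference; the helper name `relative_difference` is hypothetical.

```python
# Illustration of the "30% difference" rule on the sample heights for
# item 00000000001 (ids 1, 2, 3). Measuring relative to the smaller value
# is an assumption; the question does not fix the reference value.
from itertools import combinations

heights = {1: 1.0, 2: 1.1, 3: 2.0}  # id -> height, copied from the table


def relative_difference(a, b):
    """Difference between a and b as a fraction of the smaller value."""
    low, high = sorted((a, b))
    return (high - low) / low


# Keep every pair of ids whose heights differ by 30% or more.
flagged_pairs = [
    (id_a, id_b)
    for (id_a, h_a), (id_b, h_b) in combinations(heights.items(), 2)
    if relative_difference(h_a, h_b) >= 0.30
]
print(flagged_pairs)
```

Under that assumption, ids 1 and 2 (1.0 vs 1.1) are not flagged against each other, while both are flagged against id 3.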

4 answers:

Answer 0: (score: 3)

This will give you the rows that differ from the average by more than 30%:

SELECT t1.*
FROM tbl t1
INNER JOIN (
    SELECT
         item_id_nm,
        AVG(width) awidth, AVG(height) aheight, 
        AVG(length) alength, AVG(weight) aweight
    FROM tbl
    GROUP BY item_id_nm ) t2
USING (item_id_nm)
WHERE 
    width > awidth * 1.3 OR width < awidth * 0.7
    OR height > aheight * 1.3 OR height < aheight * 0.7
    OR length > alength * 1.3 OR length < alength * 0.7
    OR weight > aweight * 1.3 OR weight < aweight * 0.7
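
As a quick check of the average-based query, the sketch below loads the six sample rows from the question into an in-memory SQLite database and runs it. Using SQLite through Python's `sqlite3` module is an assumption made only so the example is self-contained; the thread does not say which database the asker uses.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, item_id_nm TEXT,"
    " height REAL, width REAL, length REAL, weight REAL)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "00000000001", 1.0, 1.0, 1.0, 1.0),
        (2, "00000000001", 1.1, 1.0, 0.9, 1.1),
        (3, "00000000001", 2.0, 1.0, 1.0, 1.0),
        (4, "00000000002", 1.0, 1.0, 1.0, 1.0),
        (5, "00000000002", 1.0, 1.1, 1.1, 1.0),
        (6, "00000000002", 1.0, 1.0, 1.0, 2.0),
    ],
)

# Rows whose value in any dimension is more than 30% away from the
# per-item average of that dimension.
outlier_ids = [row[0] for row in conn.execute("""
    SELECT t1.id
    FROM tbl t1
    INNER JOIN (
        SELECT item_id_nm,
               AVG(width) AS awidth, AVG(height) AS aheight,
               AVG(length) AS alength, AVG(weight) AS aweight
        FROM tbl
        GROUP BY item_id_nm
    ) t2 USING (item_id_nm)
    WHERE t1.width  > awidth  * 1.3 OR t1.width  < awidth  * 0.7
       OR t1.height > aheight * 1.3 OR t1.height < aheight * 0.7
       OR t1.length > alength * 1.3 OR t1.length < alength * 0.7
       OR t1.weight > aweight * 1.3 OR t1.weight < aweight * 0.7
    ORDER BY t1.id
""")]
print(outlier_ids)
```

On this data only ids 3 and 6 come back. The outlier pulls the group average toward itself, so the rows it differs from (for example ids 1 and 2) stay within 30% of the average and are not reported.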

This one should give you the pairs of rows that differ by 30%:

SELECT t1.*,t2.*
FROM tbl t1
INNER JOIN tbl t2
USING (item_id_nm)
WHERE 
     (t1.width > t2.width * 1.3 OR t1.width < t2.width * 0.7)
    OR (t1.height > t2.height * 1.3 OR t1.height < t2.height * 0.7)
    OR (t1.length > t2.length * 1.3 OR t1.length < t2.length * 0.7)
    OR (t1.weight > t2.weight * 1.3 OR t1.weight < t2.weight * 0.7)
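
The self-join can be tried on the question's sample rows with the sketch below. SQLite via Python's `sqlite3` is an assumption for the sake of a runnable example, and the `find_pairs` helper and the optional `t1.id < t2.id` filter are illustrative additions, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, item_id_nm TEXT,"
    " height REAL, width REAL, length REAL, weight REAL)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "00000000001", 1.0, 1.0, 1.0, 1.0),
        (2, "00000000001", 1.1, 1.0, 0.9, 1.1),
        (3, "00000000001", 2.0, 1.0, 1.0, 1.0),
        (4, "00000000002", 1.0, 1.0, 1.0, 1.0),
        (5, "00000000002", 1.0, 1.1, 1.1, 1.0),
        (6, "00000000002", 1.0, 1.0, 1.0, 2.0),
    ],
)

# The pairwise condition from the answer, with the column name corrected.
PAIR_CONDITION = """
       (t1.width  > t2.width  * 1.3 OR t1.width  < t2.width  * 0.7)
    OR (t1.height > t2.height * 1.3 OR t1.height < t2.height * 0.7)
    OR (t1.length > t2.length * 1.3 OR t1.length < t2.length * 0.7)
    OR (t1.weight > t2.weight * 1.3 OR t1.weight < t2.weight * 0.7)
"""


def find_pairs(extra_filter=""):
    """Return (t1.id, t2.id) pairs matching the condition (hypothetical helper)."""
    query = f"""
        SELECT t1.id, t2.id
        FROM tbl t1
        INNER JOIN tbl t2 USING (item_id_nm)
        WHERE ({PAIR_CONDITION}) {extra_filter}
        ORDER BY t1.id, t2.id
    """
    return conn.execute(query).fetchall()


all_pairs = find_pairs()                           # every matching ordered pair
unique_pairs = find_pairs("AND t1.id < t2.id")     # one row per pair of ids
print(len(all_pairs), unique_pairs)
```

On this sample data every matching pair appears in both orders, so the extra filter halves the result. The condition itself is not symmetric, though: `t1 > t2 * 1.3` triggers at a ratio of 1.3, while `t1 < t2 * 0.7` needs a ratio of about 1.43, so with the filter in place a pair whose ratio falls between those two values may be missed depending on which row has the lower id.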

Answer 1: (score: 2)

I think you could use something like this:

SELECT item_id_nm
FROM yourtable
GROUP BY item_id_nm
HAVING
  MIN(height)*1.3 < MAX(height) OR
  MIN(width)*1.3 < MAX(width) OR
  MIN(length)*1.3 < MAX(length) OR
  MIN(weight)*1.3 < MAX(weight)
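
The MIN/MAX query can be exercised as follows. SQLite via Python's `sqlite3` is an assumption, and item 00000000003 (ids 7 and 8) is hypothetical data added only to show a group that is not reported; it does not appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INTEGER PRIMARY KEY, item_id_nm TEXT,"
    " height REAL, width REAL, length REAL, weight REAL)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "00000000001", 1.0, 1.0, 1.0, 1.0),
        (2, "00000000001", 1.1, 1.0, 0.9, 1.1),
        (3, "00000000001", 2.0, 1.0, 1.0, 1.0),
        (4, "00000000002", 1.0, 1.0, 1.0, 1.0),
        (5, "00000000002", 1.0, 1.1, 1.1, 1.0),
        (6, "00000000002", 1.0, 1.0, 1.0, 2.0),
        # Hypothetical rows, not from the question: a group whose values
        # all stay within 30% of each other.
        (7, "00000000003", 1.0, 1.0, 1.0, 1.0),
        (8, "00000000003", 1.1, 1.0, 1.0, 1.0),
    ],
)

# Items where the largest value of some dimension exceeds the smallest
# value of that dimension by more than 30%.
varying_items = [row[0] for row in conn.execute("""
    SELECT item_id_nm
    FROM yourtable
    GROUP BY item_id_nm
    HAVING MIN(height) * 1.3 < MAX(height)
        OR MIN(width)  * 1.3 < MAX(width)
        OR MIN(length) * 1.3 < MAX(length)
        OR MIN(weight) * 1.3 < MAX(weight)
    ORDER BY item_id_nm
""")]
print(varying_items)
```

This form returns one row per item rather than the individual rows, and the `MIN(...) * 1.3 < MAX(...)` comparison presumes the measurements are positive.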

Answer 2: (score: 2)

SELECT
    *
FROM
    TableName
WHERE
   (height > 1.3 * width OR height < 0.7 * width) OR
   (length > 1.3 * width OR length < 0.7 * width)
GROUP BY
    item_id_nm
HAVING
    COUNT(item_id_nm) > 1

Answer 3: (score: 0)

I would use:

SELECT s1.id AS id1, s2.id AS id2
, s1.height AS h1, s2.height as h2
, s1.width as width1, s2.width as width2
, s1.length as l1, s2.length as l2
, s1.weight as weight1, s2.weight as weight2
FROM stack s1
INNER JOIN stack s2
ON s1.item_id_nm = s2.item_id_nm
WHERE s1.id != s2.id
AND s1.id < s2.id
AND (abs(100-((s2.height*100)/s1.height)) > 30
OR abs(100-((s2.width*100)/s1.width)) > 30
OR abs(100-((s2.length*100)/s1.length)) > 30
OR abs(100-((s2.weight*100)/s1.weight)) > 30)

Tested with PostgreSQL (http://sqlfiddle.com/#!12/e5f25/15). This code does not return duplicate rows.