Finding similar items in SQL

Time: 2014-04-07 21:14:54

Tags: sql tsql

I have a table of items:

╔════════╦═══════╦═══════╦════════╗
║ ItemID ║ Color ║ Size  ║ Smell  ║
╠════════╬═══════╬═══════╬════════╣
║ Z300   ║ black ║ big   ║ stinky ║
║ Z200   ║ white ║ big   ║ stinky ║
║ Z100   ║ black ║ small ║ stinky ║
║ Z050   ║ black ║ small ║ yummy  ║
╚════════╩═══════╩═══════╩════════╝

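Say I want to find the items that are similar to Z300. An item is considered similar only if 2 of its 3 attributes (Color, Size, Smell) match. So Z200 and Z100 would match, but Z050 would not, because it matches on only 1 of 3. I need help writing a SQL query that produces this.

Thanks for your help.

6 answers:

Answer 0: (score: 2)

This should be close to what you need. I added an extra row that is not similar to any other item, to show what happens when there is no match. If needed, you can add a WHERE clause to the query to restrict it to a single base item.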

DECLARE @Items TABLE (
    ItemId      VARCHAR(16),
    Color       VARCHAR(16),
    Size        VARCHAR(16),
    Smell       VARCHAR(16)
);
INSERT @Items 
SELECT 'Z300', 'black', 'big', 'stinky'
UNION SELECT 'Z200', 'white', 'big', 'stinky'
UNION SELECT 'Z100', 'black', 'small', 'stinky'
UNION SELECT 'Z050', 'black', 'small', 'yummy'
UNION SELECT 'Z025', 'yellow', 'medium', 'tasty'

SELECT
    Base.ItemId AS BaseItemId, 
    Base.Color AS BaseItemColor, 
    Base.Size AS BaseItemSize, 
    Base.Smell AS BaseItemSmell,
    Sim.ItemId AS SimilarItemId,
    Sim.Color AS SimilarItemColor,
    Sim.Size AS SimilarItemSize,
    Sim.Smell AS SimilarItemSmell
FROM @Items AS Base
LEFT JOIN @Items AS Sim
ON ( 
    (Base.Color = Sim.Color AND Base.Size = Sim.Size ) OR
    (Base.Color = Sim.Color AND Base.Smell = Sim.Smell ) OR
    (Base.Size = Sim.Size AND Base.Smell = Sim.Smell ) 
   ) AND Base.ItemId != Sim.ItemId;

Answer 1: (score: 1)

Quick and tested locally (with Postgres, but it should also work on MySQL once you remove the public. prefix):

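select
    foo2.*
from
    public.foo as foo1
left join
    public.foo as foo2 on (
        foo1.Color = foo2.Color and foo1.Size  = foo2.Size  or
        foo1.Size  = foo2.Size  and foo1.Smell = foo2.Smell or
        foo1.Smell = foo2.Smell and foo1.Color = foo2.Color
    )
    -- do not report the base item as similar to itself
    and foo1.ItemID <> foo2.ItemID
where
    -- the key column is named ItemID in the question's table
    foo1.ItemID = 'Z300';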

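Answer 2: (score: 0)

You can easily extend this query to fit your requirements. For a 6-of-7 match, you would have 7 OR conditions, each ANDing 6 columns.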

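SELECT
  DISTINCT T1.*
FROM tbl T1
   JOIN tbl T2
      ON (T1.Color = T2.Color AND T1.Size = T2.Size
       OR T1.Color = T2.Color AND T1.Smell = T2.Smell
       OR T1.Size = T2.Size AND T1.Smell = T2.Smell)
-- Without a filter every row matches itself, so restrict to one base item
-- and leave that item out of the results.
WHERE T2.ItemID = 'Z300'
  AND T1.ItemID <> T2.ItemID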
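Writing the OR groups by hand gets tedious as the number of attributes grows. As a rough sketch (not part of the original answer), the predicate can be generated instead. The helper name `k_of_n_predicate` and the aliases `T1`/`T2` are assumptions chosen for illustration; the column names must come from trusted code, not user input, since they are pasted into the SQL text.

```python
from itertools import combinations


def k_of_n_predicate(columns, k, left="T1", right="T2"):
    """Build an OR-of-ANDs predicate that is true when at least k of the
    given columns are equal between the two table aliases."""
    groups = []
    for combo in combinations(columns, k):
        # One AND group per way of choosing k columns out of n.
        ands = " AND ".join(f"{left}.{col} = {right}.{col}" for col in combo)
        groups.append(f"({ands})")
    return "\n OR ".join(groups)


# The 2-of-3 case from the question.
print(k_of_n_predicate(["Color", "Size", "Smell"], 2))
```

For 6 of 7 columns this produces the 7 groups mentioned above, because there are 7 ways to leave one column out.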

Answer 3: (score: 0)

I think "adding up the matches" is easier to maintain if more attributes get added.

select a.*, b.*
from mycars a
inner join mycars b
on (
    case when a.Color = b.Color  then 1 else 0 end
  + case when a.Size = b.Size  then 1 else 0 end
  + case when a.Smell = b.Smell  then 1 else 0 end)  > 1
  and a.ItemID > b.ItemID
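To see the match counts for the question's data, here is a small sketch that runs the same "add up the matches" idea against an in-memory SQLite database. SQLite stands in for SQL Server here, the table name `Items` is an assumption, and the query is filtered to the single base item Z300 rather than listing every pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (ItemID TEXT, Color TEXT, Size TEXT, Smell TEXT)")
conn.executemany(
    "INSERT INTO Items VALUES (?, ?, ?, ?)",
    [
        ("Z300", "black", "big", "stinky"),
        ("Z200", "white", "big", "stinky"),
        ("Z100", "black", "small", "stinky"),
        ("Z050", "black", "small", "yummy"),
    ],
)

# Count how many of the three attributes each other item shares with the base item.
query = """
SELECT b.ItemID,
       (CASE WHEN a.Color = b.Color THEN 1 ELSE 0 END
      + CASE WHEN a.Size  = b.Size  THEN 1 ELSE 0 END
      + CASE WHEN a.Smell = b.Smell THEN 1 ELSE 0 END) AS Matches
FROM Items AS a
JOIN Items AS b ON a.ItemID <> b.ItemID
WHERE a.ItemID = ?
ORDER BY b.ItemID
"""
rows = conn.execute(query, ("Z300",)).fetchall()

# Keep the items that match on at least 2 of the 3 attributes.
similar = [item_id for item_id, matches in rows if matches >= 2]
print(rows)     # [('Z050', 1), ('Z100', 2), ('Z200', 2)]
print(similar)  # ['Z100', 'Z200']
```

Adding a fourth attribute only means adding one more CASE term; the threshold stays a single number.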

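Answer 4: (score: 0)

This should be extensible without needing a lot of additional clauses. It relies on a logical XOR operator, which MySQL provides but T-SQL does not. There are two caveats. It will not find exact matches, where all three attributes are equal. And because a chain of XORs only tests parity, it also accepts an item that shares no attributes at all with the base item; the sample data happens not to contain one.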

SELECT
  DISTINCT T1.*
FROM Items T1
   JOIN Items T2
      ON (T1.Color <> T2.Color)
       XOR NOT (T1.Size <> T2.Size)
       XOR NOT (T1.Smell <> T2.Smell)
WHERE T2.ItemID = 'Z300'

http://sqlfiddle.com/#!2/d2034e/7
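To check what the XOR chain actually accepts, the following sketch enumerates all eight combinations of "attribute equal / not equal" and evaluates the same boolean shape in Python. It assumes NOT binds more tightly than XOR, which is how MySQL's operator precedence is documented; the variable names are illustrative only.

```python
from itertools import product

accepted = set()
rejected = set()
for color_eq, size_eq, smell_eq in product([False, True], repeat=3):
    # Same shape as the join condition:
    # (Color <> Color) XOR NOT (Size <> Size) XOR NOT (Smell <> Smell)
    xor_result = (not color_eq) ^ size_eq ^ smell_eq
    match_count = color_eq + size_eq + smell_eq
    (accepted if xor_result else rejected).add(match_count)

print(sorted(accepted))  # [0, 2]
print(sorted(rejected))  # [1, 3]
```

The condition is true exactly when an even number of attributes match, so it accepts 2-of-3 matches but also items with nothing in common, and it rejects exact matches.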

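Answer 5: (score: 0)

If you are using SQL Server 2008 R2 or newer, you can join the table against an unpivoted version of itself and then check the number of conditions that hold, like this: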

WITH info AS (
   SELECT ItemID, property, value
   FROM (SELECT itemid, color, size, smell FROM data) p
   UNPIVOT
   (value FOR property IN (color, size, smell)) AS unpvt
)
SELECT data.itemID, info.ItemID similar
FROM   data
       INNER JOIN info on value in (color, size, smell)
GROUP BY data.itemID, info.ItemID
HAVING count(info.ItemID) = 2

SQLFiddle
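For engines without UNPIVOT, a similar shape can be sketched with UNION ALL. The example below is an assumption-laden variant rather than the query above: it runs on SQLite, joins on both property and value (so a color can never match a size by accident), filters to the base item Z300, and uses `>= 2` so exact matches would be included as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (ItemID TEXT, Color TEXT, Size TEXT, Smell TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        ("Z300", "black", "big", "stinky"),
        ("Z200", "white", "big", "stinky"),
        ("Z100", "black", "small", "stinky"),
        ("Z050", "black", "small", "yummy"),
    ],
)

# info holds one (ItemID, property, value) row per attribute of each item.
query = """
WITH info AS (
    SELECT ItemID, 'Color' AS property, Color AS value FROM data
    UNION ALL
    SELECT ItemID, 'Size', Size FROM data
    UNION ALL
    SELECT ItemID, 'Smell', Smell FROM data
)
SELECT a.ItemID, b.ItemID AS SimilarItemID
FROM info AS a
JOIN info AS b
  ON a.property = b.property
 AND a.value = b.value
 AND a.ItemID <> b.ItemID
WHERE a.ItemID = ?
GROUP BY a.ItemID, b.ItemID
HAVING COUNT(*) >= 2
ORDER BY b.ItemID
"""
pairs = conn.execute(query, ("Z300",)).fetchall()
print(pairs)  # [('Z300', 'Z100'), ('Z300', 'Z200')]
```

Each joined row represents one shared attribute, so the group count is the number of matching attributes for that pair of items.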