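MySQL: How to retrieve all rows with a duplicate field where any of the matching rows matches a LIKE on another field

Time: 2012-09-19 22:00:02

Tags: mysql

I have a database with about 300,000 rows of product information. I need to retrieve the rows that have a duplicate UPC (COUNT(upc) > 1) where at least one of the results' descriptions matches a certain string ("Reed", for example).

For example, all of the following rows (desc, upc pairs) would be selected: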

Deer D7394    62226173
Reed R2536    62226173
Deer D7217    62226173

but not these:

Deer D0173    62278389
Deer D7289    62278389
Deer D9272    62278389

Here is the query I'm using:

SELECT a.desc, a.upc, a.sku, a.short_description 
FROM inventory a 
JOIN 
    (SELECT upc, desc 
    FROM inventory 
    GROUP BY upc 
    HAVING COUNT(upc) > 1) b 
ON a.upc = b.upc 
WHERE ((a.desc LIKE '%Reed%') OR (b.desc LIKE '%Reed%'))
AND a.upc != '' 
AND a.upc != 0 
ORDER BY upc;

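I'm relatively new to MySQL, but this seems like it should work. However, for some results the non-matching rows are not returned (i.e. Reed R2536 is returned, but Deer D7394 is not).

Any insight is greatly appreciated!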

3 Answers:

Answer 0: (score: 3)

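Brian's group_concat approach will work when the number of duplicates is small, but when it isn't, it fails silently: MySQL truncates the GROUP_CONCAT result at group_concat_max_len (1024 bytes by default), so a match that falls past that point is never seen. You'd never know; you'd just be missing rows that should be there.

What you want to do is select all UPCs where at least one description matches (and where duplicates exist), and then select every row whose UPC is in that list.

If you group all items by UPC, you can annotate each group with a count and a flag for whether any description matches: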

SELECT upc, COUNT(*) c, MAX(`desc` LIKE '%Reed%') desc_matches
FROM inventory
GROUP BY upc

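(This takes advantage of the fact that boolean operators such as LIKE actually return 0 for false and 1 for true. Taking the maximum of that column tells you whether any row in the group matched.)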
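To see the 0/1 behaviour in isolation, here is a minimal sketch that loads the six rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption made so the example runs anywhere), and the two-column table is a simplification of the real inventory table.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; both evaluate LIKE to 0 or 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (`desc` TEXT, upc TEXT)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [
        ("Deer D7394", "62226173"),
        ("Reed R2536", "62226173"),
        ("Deer D7217", "62226173"),
        ("Deer D0173", "62278389"),
        ("Deer D7289", "62278389"),
        ("Deer D9272", "62278389"),
    ],
)

# Per row, the LIKE expression is just an integer flag.
per_row = conn.execute(
    """
    SELECT `desc`, `desc` LIKE '%Reed%' AS matches
    FROM inventory
    WHERE upc = '62226173'
    ORDER BY `desc`
    """
).fetchall()

# Per group, MAX() of that flag is 1 if any row in the group matched.
per_upc = conn.execute(
    """
    SELECT upc, COUNT(*) AS c, MAX(`desc` LIKE '%Reed%') AS desc_matches
    FROM inventory
    GROUP BY upc
    ORDER BY upc
    """
).fetchall()

print(per_row)
print(per_upc)
```

The first UPC should come back with desc_matches = 1 and the second with 0, which is exactly the flag the next step filters on.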

You can then filter that list by your criteria to get just the UPCs you're interested in:

SELECT upc, COUNT(*) c, MAX(`desc` LIKE '%Reed%') desc_matches
FROM inventory
GROUP BY upc
HAVING desc_matches = 1 AND c > 1

Once you have that list, you want all products matching any of those UPCs. You can do that with a simple (not OUTER) join:

SELECT a.desc, a.upc, a.sku, a.short_description 
FROM inventory a 
JOIN 
    ( SELECT upc, COUNT(*) c, MAX(`desc` LIKE '%Reed%') desc_matches
      FROM inventory
      GROUP BY upc
      HAVING desc_matches = 1 AND c > 1
    ) b USING (upc)
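As a quick check of the joined query, the sketch below runs it against the question's sample rows, again using an in-memory SQLite database as a stand-in for MySQL (an assumption). The "Reed R0001" row with UPC 62290000 is hypothetical; it was added only to show that a matching description without a duplicate UPC is excluded by the c > 1 condition.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, with a simplified two-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (`desc` TEXT, upc TEXT)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [
        ("Deer D7394", "62226173"),
        ("Reed R2536", "62226173"),
        ("Deer D7217", "62226173"),
        ("Deer D0173", "62278389"),
        ("Deer D7289", "62278389"),
        ("Deer D9272", "62278389"),
        ("Reed R0001", "62290000"),  # hypothetical: matches, but UPC is unique
    ],
)

# The subquery keeps UPCs that are duplicated AND have a matching description;
# the join then pulls back every row that carries one of those UPCs.
result = conn.execute(
    """
    SELECT a.`desc`, a.upc
    FROM inventory a
    JOIN (
        SELECT upc, COUNT(*) AS c, MAX(`desc` LIKE '%Reed%') AS desc_matches
        FROM inventory
        GROUP BY upc
        HAVING desc_matches = 1 AND c > 1
    ) b USING (upc)
    ORDER BY a.`desc`
    """
).fetchall()

for row in result:
    print(row)
```

All three rows for UPC 62226173 come back, including the two Deer rows that the original query dropped.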

Answer 1: (score: 1)

Another possible approach, assuming you don't have too many duplicate records, is:

SELECT *
FROM inventory i
JOIN (
    SELECT upc
    FROM inventory
    GROUP BY upc
    HAVING COUNT(upc) > 1
       AND GROUP_CONCAT(`desc`) LIKE '%reed%'
) AS available_upc
ON available_upc.upc = i.upc

This assumes your table looks like:

CREATE TABLE inventory(
  sku CHAR(32) NOT NULL,
  `desc` CHAR(32) NOT NULL,
  upc CHAR(32) NOT NULL,
  short_description CHAR(32) NOT NULL,
  PRIMARY KEY (sku)
);

insert into inventory values ('D7394','Deer','62226173','Small Deer');
insert into inventory values ('R2536','Reed','62226173','Small Reed');
insert into inventory values ('D7217','Deer','62226173','Large Deer');


insert into inventory values ('D0173','Deer','62278389','Small Deer');
insert into inventory values ('D7289','Deer','62278389','Small Reed');
insert into inventory values ('D9272','Deer','62278389','Large Deer');
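For anyone who wants to try this without a MySQL server, here is a rough sketch that loads the table and rows above into an in-memory SQLite database and runs the group_concat query. SQLite is an assumption made for portability; its group_concat has no MySQL-style length cap, so the long-group concern raised in the other answer is not exercised here.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the schema and rows are the ones above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE inventory(
      sku CHAR(32) NOT NULL,
      `desc` CHAR(32) NOT NULL,
      upc CHAR(32) NOT NULL,
      short_description CHAR(32) NOT NULL,
      PRIMARY KEY (sku)
    );
    insert into inventory values ('D7394','Deer','62226173','Small Deer');
    insert into inventory values ('R2536','Reed','62226173','Small Reed');
    insert into inventory values ('D7217','Deer','62226173','Large Deer');
    insert into inventory values ('D0173','Deer','62278389','Small Deer');
    insert into inventory values ('D7289','Deer','62278389','Small Reed');
    insert into inventory values ('D9272','Deer','62278389','Large Deer');
    """
)

# Keep UPCs that occur more than once and whose concatenated descriptions
# contain "reed", then return every row for those UPCs.
skus = [
    row[0]
    for row in conn.execute(
        """
        SELECT i.sku
        FROM inventory i
        JOIN (
            SELECT upc
            FROM inventory
            GROUP BY upc
            HAVING COUNT(upc) > 1
               AND group_concat(`desc`) LIKE '%reed%'
        ) AS available_upc
        ON available_upc.upc = i.upc
        ORDER BY i.sku
        """
    )
]

print(skus)
```

Only the SKUs under UPC 62226173 are returned, because no `desc` value in the second group contains "reed".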

Answer 2: (score: 0)

It's hard to say without testing, but try:

SELECT a.desc, a.upc, a.sku, a.short_description 
FROM inventory a 
RIGHT OUTER JOIN 
    (SELECT upc
    FROM inventory 
    GROUP BY upc 
    HAVING COUNT(upc) > 1) b 
ON a.upc = b.upc 
WHERE ((a.desc LIKE '%Reed%') OR (b.desc LIKE '%Reed%'))
AND a.upc != '' 
AND a.upc != 0 
ORDER BY upc;

The key is the RIGHT OUTER JOIN. See the article: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins

Also, you only need to return upc from the inner SELECT query.