Defining the criteria used for GROUP BY

Time: 2012-10-30 12:13:14

Tags: mysql join group-by

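In the database below, I need all users and their locations, without duplicates, and with the New Jersey users listed first. (If a user has an NJ address, I don't need their other addresses. The database has been simplified for clarity.)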

> table address

|  id  |  state |   City   |
|------|--------|----------|
|  01  |   NY   |  Gotham  |
|  02  |   NY   |  Uye     |
|  03  |   NJ   |  Hoboken |
|  04  |   NJ   |  Newark  |

> table contact

|  user  |  address  |
|--------|-----------|
|   01   |     01    |
|   02   |     02    |
|   02   |     03    |
|   03   |     04    |
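
For anyone who wants to reproduce the two tables, here is a minimal sketch. It is an assumption on my part, not part of the original setup: it uses an in-memory SQLite database through Python's `sqlite3` module instead of MySQL, stores the ids as text so they keep their leading zeros, and the helper name `build_sample_db` is made up for illustration.

```python
import sqlite3


def build_sample_db():
    """Create the question's two tables in an in-memory SQLite database.

    SQLite stands in for MySQL here (assumption); ids are TEXT so that
    values such as '01' keep their leading zero.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE address (id TEXT PRIMARY KEY, state TEXT, city TEXT);
    CREATE TABLE contact (user TEXT, address TEXT REFERENCES address(id));
    INSERT INTO address VALUES
      ('01', 'NY', 'Gotham'), ('02', 'NY', 'Uye'),
      ('03', 'NJ', 'Hoboken'), ('04', 'NJ', 'Newark');
    INSERT INTO contact VALUES
      ('01', '01'), ('02', '02'), ('02', '03'), ('03', '04');
    """)
    return conn


conn = build_sample_db()

# Count addresses per user: user 02 is the one with both an NY and an NJ address.
addresses_per_user = conn.execute(
    "SELECT user, COUNT(*) FROM contact GROUP BY user ORDER BY user"
).fetchall()
for row in addresses_per_user:
    print(row)
```

User 02 is the interesting case: they have one NY and one NJ address, and only the NJ one should survive.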

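Below are some of my attempts and their output. From what I've read [and I've spent hours on this], I think this is called an ambiguous GROUP BY query, and that only MySQL allows it.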

However, I can't figure out the right way to do this. Please help!

Desired result:

|  user  |  state  |
|--------|---------|
|   02   |   NJ    |
|   03   |   NJ    |
|   01   |   NY    |

Other attempts:

SELECT user, state FROM contact, address WHERE id = address;
// Duplicate users, and addresses I do not need.

|  user  |  state  |
|--------|---------|
|   01   |   NY    |
|   02   |   NY    |
|   02   |   NJ    |
|   03   |   NJ    |  

SELECT user, state FROM contact, address WHERE id = address GROUP BY user;
// NY address. I need the NJ address.

|  user  |  state  |
|--------|---------|
|   01   |   NY    |
|   02   |   NY    |
|   03   |   NJ    |

SELECT user, state FROM contact, address WHERE id = address GROUP BY user HAVING state = 'NJ';
// Worse, now I lose my NY users, and one of my NJ users doesn't even show up.

|  user  |  state  |
|--------|---------|
|   03   |   NJ    |

3 Answers:

Answer 0 (score: 3)

It may not be pretty, but this solution works. I've updated the query here to reflect the changes in the fiddle:

SELECT user, state, `state` LIKE 'NJ' ismatch
FROM contact, address WHERE id = address
AND (state = 'NJ' OR `user` NOT IN 
    (SELECT `user` FROM contact, address WHERE address = id AND `state` LIKE 'NJ'))
GROUP BY user
ORDER BY ismatch DESC;

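Simple explanation: join each user with their NJ address, or with any of their addresses if they have none in NJ. The GROUP BY is needed to keep duplicate users out of the result.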

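Basically the problem boils down to this: retrieve the matching address if the user has one, otherwise just grab any address for that user. Hence the second condition in the WHERE clause: if the user has an NJ address, only the NJ address(es) get pulled; otherwise all of that user's addresses pass the filter and the GROUP BY collapses them to a single row. The ismatch field is only there for sorting.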
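
To make the WHERE logic above concrete, here is a minimal sketch run against the question's sample data. Several things in it are my assumptions rather than the original query: it runs on an in-memory SQLite database via Python's `sqlite3` instead of MySQL, it spells the joins out explicitly, it uses `MIN(state)` and `MAX(state = 'NJ')` in place of bare columns so the output does not depend on how the engine resolves a non-aggregated column under GROUP BY, and it adds `user` as a tiebreaker in the ORDER BY.

```python
import sqlite3

# Sample data copied from the question; SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (id TEXT PRIMARY KEY, state TEXT, city TEXT);
CREATE TABLE contact (user TEXT, address TEXT);
INSERT INTO address VALUES
  ('01', 'NY', 'Gotham'), ('02', 'NY', 'Uye'),
  ('03', 'NJ', 'Hoboken'), ('04', 'NJ', 'Newark');
INSERT INTO contact VALUES
  ('01', '01'), ('02', '02'), ('02', '03'), ('03', '04');
""")

# Keep a row if it is an NJ address, or if its user has no NJ address at all.
# After that filter, every row of a given user is either NJ or non-NJ, never both,
# so the aggregates below only collapse duplicates.
QUERY = """
SELECT c.user,
       MIN(a.state)        AS state,
       MAX(a.state = 'NJ') AS ismatch
FROM contact c
JOIN address a ON a.id = c.address
WHERE a.state = 'NJ'
   OR c.user NOT IN (SELECT c2.user
                     FROM contact c2
                     JOIN address a2 ON a2.id = c2.address
                     WHERE a2.state = 'NJ')
GROUP BY c.user
ORDER BY ismatch DESC, c.user
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

User 02's NY row is filtered out before grouping, which is why that user comes back with NJ rather than NY.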

Note: I originally wrote this using JOINs between the tables. They have been removed for clarity.

Link to SQL Fiddle

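Edit: It was pointed out to me that this solution doesn't work in all cases (see comments). To fix that, I added a GROUP BY, and now it seems to work. I also replaced the CASE with a simple comparison to improve performance: SQLFiddle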

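In addition, someone I know with database administration experience suggested another answer to me. His suggestion was to use (at least) two queries, so I built those queries and combined them with a UNION: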

SELECT user, state, 1 ismatch FROM contact, address
WHERE id = address AND state = 'NJ' GROUP BY user
UNION
SELECT user, state, 0 ismatch FROM contact, address
WHERE id = address AND `user` NOT IN (SELECT `user` FROM contact, address WHERE address = id AND `state` LIKE 'NJ')
GROUP BY user ORDER BY ismatch DESC

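Simple explanation: get every user who has an NJ address, then get every user who has no NJ address along with any one of their addresses, and UNION the two result sets together. GROUP BY is used in both queries to prevent duplicates.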
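
Here is a minimal sketch of the UNION variant on the question's sample data. Same caveats as before, and all of them are my assumptions: in-memory SQLite through Python's `sqlite3` rather than MySQL, explicit joins, `MIN(state)` instead of a bare column, and an extra `user` tiebreaker in the final ORDER BY.

```python
import sqlite3

# Sample data copied from the question; SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (id TEXT PRIMARY KEY, state TEXT, city TEXT);
CREATE TABLE contact (user TEXT, address TEXT);
INSERT INTO address VALUES
  ('01', 'NY', 'Gotham'), ('02', 'NY', 'Uye'),
  ('03', 'NJ', 'Hoboken'), ('04', 'NJ', 'Newark');
INSERT INTO contact VALUES
  ('01', '01'), ('02', '02'), ('02', '03'), ('03', '04');
""")

# First half: users with an NJ address (ismatch = 1).
# Second half: users with no NJ address at all (ismatch = 0).
# The trailing ORDER BY applies to the combined result.
QUERY = """
SELECT c.user AS user, MIN(a.state) AS state, 1 AS ismatch
FROM contact c
JOIN address a ON a.id = c.address
WHERE a.state = 'NJ'
GROUP BY c.user
UNION
SELECT c.user AS user, MIN(a.state) AS state, 0 AS ismatch
FROM contact c
JOIN address a ON a.id = c.address
WHERE c.user NOT IN (SELECT c2.user
                     FROM contact c2
                     JOIN address a2 ON a2.id = c2.address
                     WHERE a2.state = 'NJ')
GROUP BY c.user
ORDER BY ismatch DESC, user
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The two halves can never return the same user, since the second one explicitly excludes anyone who shows up in the first.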

Answer 1 (score: 3)

A somewhat belated answer, and a bit long-winded, but it's standard SQL;

SELECT DISTINCT c1.user,a1.state,a1.city
FROM contact c1
LEFT JOIN contact c2 
  ON c1.user = c2.user AND c1.address<>c2.address
JOIN address a1
  ON a1.id=c1.address
LEFT JOIN address a2 
  ON a2.id=c2.address AND
     ( a1.state<> 'NJ' AND a2.state='NJ' OR
       a1.state=a2.state AND a1.id>a2.id )
GROUP BY c1.user,a1.state,a1.city
HAVING MAX(a2.id) IS NULL

SQLfiddle here

Edit: Added a fix for the duplicate case you found. A rough explanation of the query;

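For each user, it goes through every contact/address combination (c1/a1) and uses left joins to compare it against all the other combinations for the same user (c2/a2), leaving a2.id NULL whenever a1/c1 is "better". If an address never comes out "worse" in any comparison, it is the one that is kept.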
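
To see the comparison described above play out, here is a minimal sketch on the question's sample data. My assumptions, none of which come from the original answer: it runs on an in-memory SQLite database via Python's `sqlite3` rather than MySQL, it drops the DISTINCT (the GROUP BY already covers the same columns), and it adds an ORDER BY so the NJ rows come first.

```python
import sqlite3

# Sample data copied from the question; SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (id TEXT PRIMARY KEY, state TEXT, city TEXT);
CREATE TABLE contact (user TEXT, address TEXT);
INSERT INTO address VALUES
  ('01', 'NY', 'Gotham'), ('02', 'NY', 'Uye'),
  ('03', 'NJ', 'Hoboken'), ('04', 'NJ', 'Newark');
INSERT INTO contact VALUES
  ('01', '01'), ('02', '02'), ('02', '03'), ('03', '04');
""")

# a2 only joins when it "beats" a1: either a2 is in NJ and a1 is not,
# or both are in the same state and a2 has the lower id.
# A (c1, a1) group whose a2.id is always NULL was never beaten, so it is kept.
QUERY = """
SELECT c1.user, a1.state, a1.city
FROM contact c1
LEFT JOIN contact c2
  ON c1.user = c2.user AND c1.address <> c2.address
JOIN address a1
  ON a1.id = c1.address
LEFT JOIN address a2
  ON a2.id = c2.address AND
     ( (a1.state <> 'NJ' AND a2.state = 'NJ') OR
       (a1.state = a2.state AND a1.id > a2.id) )
GROUP BY c1.user, a1.state, a1.city
HAVING MAX(a2.id) IS NULL
ORDER BY (a1.state = 'NJ') DESC, c1.user
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

For user 02, the NY/Uye row is beaten by the NJ/Hoboken row, so only Hoboken survives the HAVING filter.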

Answer 2 (score: 1)

Try this:

SELECT user, state FROM contact, address WHERE id = address 
and state='NJ'
union
SELECT min(user), state FROM contact, address WHERE id = address 
and state<>'NJ'
group by state

SQLFIDDLE DEMO