How to search in grouped columns in MySQL?

Asked: 2011-10-19 00:33:23

Tags: mysql sql

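I need to create a database of guys. A guy can have one or more attributes, and each of a guy's attributes has a specific value. Sounds easy, right? Well, keep reading, because the problem turns out to be nearly impossible (I have been fighting with it for 5 days :s).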

So I created these 3 tables:

CREATE TABLE guy (
  id int(11),
  name varchar(255)
);

CREATE TABLE attribute (
  id int(11),
  name varchar(255)
);

-- each value references one guy and one attribute
CREATE TABLE _value (
  id int(11),
  guy_id int(11),
  attribute_id int(11),
  _value varchar(255)
);

With this sample data:

INSERT INTO attribute VALUES (1, 'age'), (2, 'dollars'), (3, 'candies');
INSERT INTO guy VALUES (1, 'John'), (2, 'Bob');
INSERT INTO _value VALUES (1, 1, 1, 12), (2, 1, 2, 15), (3, 1, 3, 3);
INSERT INTO _value VALUES (4, 2, 1, 15), (5, 2, 2, 20), (6, 2, 3, 6);

And this query:

SELECT g.name 'guy', a.name 'attribute', v._value 'value' 
FROM guy g 
JOIN _value v ON g.id = v.guy_id 
JOIN attribute a ON a.id = v.attribute_id;

Gives me this result:

+------+-----------+-------+
| guy  | attribute | value |
+------+-----------+-------+
| John | age       | 12    |
| John | dollars   | 15    |
| John | candies   | 3     |
| Bob  | age       | 15    |
| Bob  | dollars   | 20    |
| Bob  | candies   | 6     |
+------+-----------+-------+

Here is the real problem:

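Later, my boss told me he wants to filter the data using as many conditions as he likes, and to be able to group those conditions with "ands" and "ors". For example, he may want something as crazy as this: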

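Get the guys whose age is greater than 10, who have less than 18 dollars, and who have more than 2 candies and fewer than 10 candies, but, no matter what, also include the guys whose age is exactly 15. This translates to this filter: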

-- should return both John and Bob
(age > 10 and dollars < 18 and candies > 2 and candies < 10) or (age = 15)

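I have no problem building the filter (I use jqgrid). The problem is that the attributes are not columns but rows, so I don't know how to combine the query with the filter. I tried something like this: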

SELECT g.name 'guy', a.name 'attribute', v._value 'value' 
FROM guy g 
JOIN _value v ON g.id = v.guy_id 
JOIN attribute a ON a.id = v.attribute_id
GROUP BY guy
HAVING (
    (attribute = 'age' and value > 10) AND
    (attribute = 'dollars' and value < 18) AND
    (attribute = 'candies' and value > 2) AND
    (attribute = 'candies' and value < 10)
       )
OR
       (
     (attribute = 'age' and value = 15)
       )

But it returns only Bob :( I should get both John and Bob.

So, how should I combine the filter and the query?

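Keep in mind that every guy has the same number of attributes, but more attributes and more guys can be added at any time. For example, if I want to add the guy 'Mario', I would do: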

-- we insert the guy Mario
INSERT INTO guy VALUES (3, 'Mario');
-- with age = 5, dollars = 100 and candies = 1
INSERT INTO _value VALUES (7, 3, 1, 5), (8, 3, 2, 100), (9, 3, 3, 1);

And if I want to create the attribute 'apples', I would do:

-- we insert the attribute apples
INSERT INTO attribute VALUES (4, 'apples');
-- we create a value for each guy's new attribute: John has 7 apples, Bob has 2 and Mario has 8
INSERT INTO _value VALUES (10, 1, 4, 7), (11, 2, 4, 2), (12, 3, 4, 8);

Now I should be able to include conditions about apples in my queries.

I hope I have made myself understood. Thanks for your time :)

Note: Maybe there is a way to put all of a guy's attributes on a single row? Something like this:

+-------+-----------+-------+-------+-----------+-------+-------+-----------+-------+-------+-----------+-------+
| guy   | attribute | value | guy   | attribute | value | guy   | attribute | value | guy   | attribute | value |
+-------+-----------+-------+-------+-----------+-------+-------+-----------+-------+-------+-----------+-------+
| John  | age       |    12 | John  | dollars   |    15 | John  | candies   |     3 | John  | apples    |     7 |
| Bob   | age       |    15 | Bob   | dollars   |    20 | Bob   | candies   |     6 | Bob   | apples    |     2 |
| Mario | age       |     5 | Mario | dollars   |   100 | Mario | candies   |     1 | Mario | apples    |     8 |
+-------+-----------+-------+-------+-----------+-------+-------+-----------+-------+-------+-----------+-------+

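Note 2: @iim suggested (in this question: How to search in grouped columns in MySQL? (also in Hibernate if possible)) that I could do a self-join for each attribute. Yes, that may solve the problem, but there could be performance problems when guys have tons of attributes (like 30 or more).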

Note 3: I can't change the database schema :(

5 Answers:

Answer 0: (score: 2)

How about something like this?

SELECT g.name 'guy', a.name 'attribute', v._value 'value' 
FROM guy g 
JOIN _value v1 ON g.id = v1.guy_id 
  JOIN attribute a1 ON a1.id = v1.attribute_id
JOIN _value v2 ON g.id = v2.guy_id 
  JOIN attribute a2 ON a2.id = v2.attribute_id
JOIN _value v3 ON g.id = v3.guy_id 
  JOIN attribute a3 ON a3.id = v3.attribute_id
JOIN _value v4 ON g.id = v4.guy_id 
  JOIN attribute a4 ON a4.id = v4.attribute_id
JOIN _value v5 ON g.id = v5.guy_id 
  JOIN attribute a5 ON a5.id = v5.attribute_id
WHERE (
    (a1 = 'age' and v1 > 10) AND
    (a2 = 'dollars' and v2 < 18) AND
    (a3 = 'candies' and v3 > 2) AND
    (a4 = 'candies' and v4 < 10)
  ) OR (a5 = 'age' and v5 = 15)

Edited to fix some silly mistakes:

SELECT DISTINCT g.id, g.name 'guy'
FROM guy g 
JOIN _value v1 ON g.id = v1.guy_id 
  JOIN attribute a1 ON a1.id = v1.attribute_id
JOIN _value v2 ON g.id = v2.guy_id 
  JOIN attribute a2 ON a2.id = v2.attribute_id
JOIN _value v3 ON g.id = v3.guy_id 
  JOIN attribute a3 ON a3.id = v3.attribute_id
JOIN _value v4 ON g.id = v4.guy_id 
  JOIN attribute a4 ON a4.id = v4.attribute_id
JOIN _value v5 ON g.id = v5.guy_id 
  JOIN attribute a5 ON a5.id = v5.attribute_id
WHERE (
    (a1.name = 'age' and v1._value > 10) AND
    (a2.name = 'dollars' and v2._value < 18) AND
    (a3.name = 'candies' and v3._value > 2) AND
    (a4.name = 'candies' and v4._value < 10)
  ) OR (a5.name = 'age' and v5._value = 15)

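Specifically, I had forgotten the field names in the WHERE clause. The query now selects only the 'guy' field, and DISTINCT was added so that each guy produces only one row.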
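
The self-join above pairs every value row of a guy with every other one before filtering. A possible variation, given only as a sketch, pins each join to a single attribute in its ON clause so each alias contributes at most one row per guy. It assumes one `_value` row per (guy, attribute) and unique attribute names (otherwise the scalar subqueries fail or rows multiply); the aliases `age`, `dollars` and `candies` are made up for illustration. It also relies on MySQL comparing the varchar `_value` numerically against a number.

```sql
-- Sketch: one LEFT JOIN per attribute used in the filter.
-- A missing attribute yields NULL, so its comparisons are simply not true.
SELECT g.id, g.name AS guy
FROM guy g
LEFT JOIN _value age
       ON age.guy_id = g.id
      AND age.attribute_id = (SELECT id FROM attribute WHERE name = 'age')
LEFT JOIN _value dollars
       ON dollars.guy_id = g.id
      AND dollars.attribute_id = (SELECT id FROM attribute WHERE name = 'dollars')
LEFT JOIN _value candies
       ON candies.guy_id = g.id
      AND candies.attribute_id = (SELECT id FROM attribute WHERE name = 'candies')
WHERE (age._value > 10 AND dollars._value < 18
       AND candies._value > 2 AND candies._value < 10)
   OR age._value = 15;
```

With the question's sample data this should return John (first branch) and Bob (second branch). Only attributes that appear in the filter need a join, which may help when guys have 30 or more attributes but the filter touches a handful.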

Answer 1: (score: 1)

Something like this might be an option:

select g.name as guy
from guy g
join _value v on g.id = v.guy_id
join attribute a on a.id = v.attribute_id
where (a.name = 'age'     and v._value > 10)
   or (a.name = 'dollars' and v._value < 18)
   or (a.name = 'candies' and v._value > 2 and v._value < 10)
group by g.name
having count(*) = 3

union

select g.name as guy
from guy g
join _value v on g.id = v.guy_id
join attribute a on a.id = v.attribute_id
 where (a.name = 'age' and v._value = 15)
group by g.name       -- These two clauses are not necessary,
having count(*) = 1   -- they're just here for symmetry

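You turn the outer "or" conditions into a UNION, and the "and" conditions within each branch are handled by the usual "having count(*) equals the number of rows that must match" pattern.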

I don't know whether this approach will work for everything your boss wants you to do, but maybe it helps.
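
To see why the count test works, it can help to look at the per-guy match counts before the HAVING clause is applied. The query below is a diagnostic sketch only; the column alias `matched` is made up, and both candies bounds are folded into one predicate so that each attribute row is counted at most once.

```sql
-- Count how many of each guy's attribute rows satisfy the first "and" branch.
SELECT g.name AS guy, COUNT(*) AS matched
FROM guy g
JOIN _value v ON g.id = v.guy_id
JOIN attribute a ON a.id = v.attribute_id
WHERE (a.name = 'age'     AND v._value > 10)
   OR (a.name = 'dollars' AND v._value < 18)
   OR (a.name = 'candies' AND v._value > 2 AND v._value < 10)
GROUP BY g.name;
-- With the question's sample data: John matches 3 rows, Bob matches 2
-- (his 20 dollars fail the dollars test), so only John passes
-- "having count(*) = 3". Bob enters the result through the second UNION branch.
```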

Answer 2: (score: 1)

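If the problem is that "the attributes are not columns but rows", how about a view? You cannot change the database schema, but you could consider a view like this: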

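CREATE VIEW the_attributes AS
  -- guy_id is exposed so the view can be matched back to guy
  SELECT v.guy_id, a.id AS attribute_id, a.name AS attribute_name, v._value
  FROM attribute a JOIN _value v
  ON v.attribute_id = a.id;

It might be better to start from that.

Then I think you should be able to do something like:

select guy.id from guy
where (
  exists (select 1 from the_attributes t where t.guy_id = guy.id and t.attribute_name = 'age'     and t._value > 10) and
  exists (select 1 from the_attributes t where t.guy_id = guy.id and t.attribute_name = 'dollars' and t._value < 18) and
  exists (select 1 from the_attributes t where t.guy_id = guy.id and t.attribute_name = 'candies' and t._value > 2) and
  exists (select 1 from the_attributes t where t.guy_id = guy.id and t.attribute_name = 'candies' and t._value < 10)
) or
  exists (select 1 from the_attributes t where t.guy_id = guy.id and t.attribute_name = 'age'     and t._value = 15);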

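Whether any of this ends up helping is for you to judge, but it is what first came to mind when I read the problem. It certainly looks readable ;(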
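
The view idea can be pushed one step further by pivoting the rows into one column per attribute, so the boss's filter can be pasted almost verbatim into a WHERE clause. This is a sketch under assumptions: the view name `guy_pivot` is hypothetical, all values are numeric (`_value + 0` asks MySQL to convert the varchar to a number), and each guy has at most one value per attribute. The attribute list is hard-coded, so the view would have to be recreated whenever an attribute such as 'apples' is added.

```sql
-- Pivot: one row per guy, one column per known attribute.
CREATE VIEW guy_pivot AS
SELECT g.id,
       g.name AS guy,
       MAX(CASE WHEN a.name = 'age'     THEN v._value + 0 END) AS age,
       MAX(CASE WHEN a.name = 'dollars' THEN v._value + 0 END) AS dollars,
       MAX(CASE WHEN a.name = 'candies' THEN v._value + 0 END) AS candies
FROM guy g
JOIN _value v ON v.guy_id = g.id
JOIN attribute a ON a.id = v.attribute_id
GROUP BY g.id, g.name;

-- The filter from the question, applied to ordinary columns.
SELECT guy
FROM guy_pivot
WHERE (age > 10 AND dollars < 18 AND candies > 2 AND candies < 10)
   OR age = 15;
```

Because the view aggregates, MySQL has to materialize it before filtering, so performance on large tables is something to measure rather than assume.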

Answer 3: (score: 1)

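The following lets you write your conditions more or less directly, but I can't promise it will be efficient with 100,000+ guys having 30+ attributes each. You should test that yourself.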

SELECT g.name guy  -- only the grouped column is selected; non-grouped columns are rejected under ONLY_FULL_GROUP_BY
FROM guy g 
JOIN _value v ON g.id = v.guy_id 
JOIN attribute a ON a.id = v.attribute_id
GROUP BY guy
HAVING (
    SUM(a.name = 'age'     and v._value > 10) = 1 AND
    SUM(a.name = 'dollars' and v._value < 18) = 1 AND
    SUM(a.name = 'candies' and v._value > 2 ) = 1 AND
    SUM(a.name = 'candies' and v._value < 10) = 1
       )
OR
       (
    SUM(a.name = 'age'     and v._value = 15) = 1
       )

(I'm assuming here that a guy cannot have duplicate attributes.)
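
If that assumption might not hold, one possible adjustment is to test each sum with `> 0` instead of `= 1`, so that a condition counts as satisfied when at least one of the guy's rows for that attribute meets it. The following is a sketch of that variation, not a tested drop-in; it groups by `g.id` as well, on the assumption that two guys may share a name.

```sql
-- Each SUM counts the rows of a guy for which the condition is true
-- (MySQL evaluates a boolean expression to 1 or 0).
SELECT g.name AS guy
FROM guy g
JOIN _value v ON g.id = v.guy_id
JOIN attribute a ON a.id = v.attribute_id
GROUP BY g.id, g.name
HAVING (
    SUM(a.name = 'age'     AND v._value > 10) > 0 AND
    SUM(a.name = 'dollars' AND v._value < 18) > 0 AND
    SUM(a.name = 'candies' AND v._value > 2 ) > 0 AND
    SUM(a.name = 'candies' AND v._value < 10) > 0
       )
    OR SUM(a.name = 'age' AND v._value = 15) > 0;
```

Adding a condition on a new attribute such as 'apples' would then only mean adding one more SUM term to the HAVING clause.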

Answer 4: (score: 0)

Try this, maybe it will help.

SELECT g.name 'guy', a.name 'attribute', v._value 'value' 
FROM guy g 
JOIN _value v ON g.id = v.guy_id 
JOIN attribute a ON a.id = v.attribute_id
WHERE a.ID = v.attribute_ID
      AND v._value = 'values you want'
      AND  NOT v._value = 'values you don''t want'

Let me know if you need anything else.