SQL query is not grouping correctly

Time: 2012-04-04 17:15:32

Tags: mysql group-by

For some reason, the query below returns duplicate names. Why is that?

SELECT id, name_without_variants, SUM(relevance) AS total_relevance FROM (
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(card_definitions.name_without_variants) AGAINST ('lost soul site discard')) * 0.40 AS relevance
    FROM card_definitions
    GROUP BY name_without_variants, id
    UNION
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(card_def_identities.special_ability_text) AGAINST ('lost soul site discard')) * 0.05 AS relevance
    FROM card_def_identities
    INNER JOIN card_definitions ON card_def_identities.card_def_sid = card_definitions.id
    GROUP BY name_without_variants, id
    UNION
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(brigades.brigade_color) AGAINST ('lost soul site discard')) * 0.30 AS relevance
    FROM brigades
    INNER JOIN card_def_brigades ON brigades.id = card_def_brigades.brigade_sid
    INNER JOIN card_definitions ON card_def_brigades.card_def_sid = card_definitions.id
    GROUP BY name_without_variants, id
    UNION
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(identifiers.identifier) AGAINST ('lost soul site discard')) * 0.20 AS relevance
    FROM identifiers
    INNER JOIN card_def_identifiers ON identifiers.id = card_def_identifiers.identifier_sid
    INNER JOIN card_definitions ON card_def_identifiers.card_def_sid = card_definitions.id
    GROUP BY name_without_variants, id
    UNION
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(card_effects.effect) AGAINST ('lost soul site discard')) * 0.05 AS relevance
    FROM card_effects
    INNER JOIN card_def_effects ON card_effects.id = card_def_effects.effect_sid
    INNER JOIN card_definitions ON card_def_effects.card_def_sid = card_definitions.id
    GROUP BY name_without_variants, id
) AS combined_search
GROUP BY name_without_variants, id
HAVING total_relevance > 0
ORDER BY total_relevance DESC
LIMIT 10;

Here is the result I get. Note the two Lost Soul [Site Doubler] rows:

id      name_without_variants        total_relevance
2623    Lost Soul [Deck Discard]     6.35151714086533
1410    Lost Soul [Hand Discard]     6.29273346662521
1495    Lost Soul [Discard Card]     5.93360201716423
1442    Lost Soul [Demon Discard]    5.91308708190918
1497    Lost Soul [Site Doubler]     5.05888686180115
1498    Lost Soul [Site Doubler]     5.05888686180115
2572    Lost Soul [Site Guard]       4.82421946525574
2774    Lost Soul [Far Country]      3.39325473308563
2891    Fortify Site [RoA2]          2.77084048986435
1418    Lost Soul [Hopper]           2.63041100502014

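2 answers:

Answer 0: (score: 2)

Because the ids differ and you are grouping by id as well, you get one row for each distinct (name_without_variants, id) pair; that is what GROUP BY does. If you change the top-level SELECT to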

SELECT name_without_variants, SUM(relevance) as total_relevance

and the outer GROUP BY to:

GROUP BY name_without_variants

you should see distinct names, but the ids will no longer be in the result.
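
As a minimal sketch of the change above, the following uses a small hypothetical table, combined_search_demo, standing in for the derived table combined_search. The table name and the relevance values are assumptions made up for illustration; they are not taken from the real data.

```sql
-- Hypothetical stand-in for the derived table combined_search; values are illustrative only.
CREATE TEMPORARY TABLE combined_search_demo (
    id INT,
    name_without_variants VARCHAR(100),
    relevance DOUBLE
);

INSERT INTO combined_search_demo (id, name_without_variants, relevance) VALUES
    (1497, 'Lost Soul [Site Doubler]', 2.0),
    (1497, 'Lost Soul [Site Doubler]', 0.5),
    (1498, 'Lost Soul [Site Doubler]', 2.0),
    (1498, 'Lost Soul [Site Doubler]', 0.5),
    (2572, 'Lost Soul [Site Guard]',   1.0);

-- Grouping by name and id: one row per (name, id) pair, so the name appears twice.
SELECT id, name_without_variants, SUM(relevance) AS total_relevance
FROM combined_search_demo
GROUP BY name_without_variants, id;

-- Grouping by name only: one row per name, relevance summed across both ids.
SELECT name_without_variants, SUM(relevance) AS total_relevance
FROM combined_search_demo
GROUP BY name_without_variants;
```

Keep in mind that grouping by name alone adds up the relevance of every card sharing that name. In this sketch each Site Doubler id scores 2.5 on its own, while the name-only group scores 5.0, so names with several ids may rank higher than before.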

Answer 1: (score: 0)

GROUP BY name_without_variants, id

You are grouping by name_without_variants, id. The two records have different ids:

1497    Lost Soul [Site Doubler]    5.05888686180115
1498    Lost Soul [Site Doubler]    5.05888686180115

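You need to decide how to handle the id.

Remove id from the GROUP BY and apply an aggregate function to the id column in the SELECT. Or simply drop that column altogether.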
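
One way this could look, as a sketch only: MIN(id) picks a single representative id per name, and GROUP_CONCAT lists every id that was merged. The table card_scores_demo and its values are hypothetical stand-ins for the derived table in the question, and whether MIN is the right aggregate depends on which variant you want to show.

```sql
-- Hypothetical stand-in for the UNION'ed subquery results; values are illustrative only.
CREATE TEMPORARY TABLE card_scores_demo (
    id INT,
    name_without_variants VARCHAR(100),
    relevance DOUBLE
);

INSERT INTO card_scores_demo (id, name_without_variants, relevance) VALUES
    (1497, 'Lost Soul [Site Doubler]', 2.5),
    (1498, 'Lost Soul [Site Doubler]', 2.5),
    (2572, 'Lost Soul [Site Guard]',   1.0),
    (2891, 'Fortify Site [RoA2]',      0.0);

SELECT
    MIN(id) AS id,                                   -- one representative id per name
    GROUP_CONCAT(DISTINCT id ORDER BY id) AS all_ids, -- every id that shares the name
    name_without_variants,
    SUM(relevance) AS total_relevance
FROM card_scores_demo
GROUP BY name_without_variants                       -- id is no longer a grouping column
HAVING total_relevance > 0
ORDER BY total_relevance DESC
LIMIT 10;
```

If the per-card score should not be added up across variants, MAX(relevance) or a per-id subtotal taken before this step might be closer to what you want; that choice depends on how the ranking is meant to work.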

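Here is an example simplified to a single query. Please understand that I do not have a complete view of your schema or data, and this has not been tested. I am also making some assumptions here. However, if the schema is relational, this should bring back what you are looking for: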

-- Untested sketch; the joins follow the junction tables used in the question's subqueries.
SELECT cd.id, cd.name_without_variants,
       (((MATCH(cd.name_without_variants) AGAINST ('lost soul site discard')) * 0.40) +
        ((MATCH(cdi.special_ability_text) AGAINST ('lost soul site discard')) * 0.05) +
        ((MATCH(b.brigade_color) AGAINST ('lost soul site discard')) * 0.30) +
        ((MATCH(i.identifier) AGAINST ('lost soul site discard')) * 0.20) +
        ((MATCH(ce.effect) AGAINST ('lost soul site discard')) * 0.05)
       ) AS total_relevance
FROM card_definitions cd
 LEFT OUTER JOIN card_def_identities cdi ON cd.id = cdi.card_def_sid
 LEFT OUTER JOIN card_def_brigades cdb ON cd.id = cdb.card_def_sid
 LEFT OUTER JOIN brigades b ON b.id = cdb.brigade_sid
 LEFT OUTER JOIN card_def_identifiers cdir ON cd.id = cdir.card_def_sid
 LEFT OUTER JOIN identifiers i ON i.id = cdir.identifier_sid
 LEFT OUTER JOIN card_def_effects cde ON cde.card_def_sid = cd.id
 LEFT OUTER JOIN card_effects ce ON ce.id = cde.effect_sid
GROUP BY cd.id, cd.name_without_variants
HAVING total_relevance > 0
ORDER BY total_relevance DESC
LIMIT 10;