Question

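Using the tables below as an example and the listed query as the base query, I want to add a way to select only the rows with the max id, without having to do a second query!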

TABLE VEHICLES

id      vehicleName
-----   --------
1       cool car
2       cool car
3       cool bus
4       cool bus
5       cool bus
6       car
7       truck
8       motorcycle
9       scooter
10      scooter
11      bus

TABLE VEHICLE NAMES

nameId  vehicleName
------  -------
1       cool car
2       cool bus
3       car
4       truck
5       motorcycle
6       scooter
7       bus

TABLE VEHICLE ATTRIBUTES

nameId  attribute
------  ---------
1       FAST
1       SMALL
1       SHINY
2       BIG
2       SLOW
3       EXPENSIVE
4       SHINY
5       FAST
5       SMALL
6       SHINY
6       SMALL
7       SMALL

Base query:

select a.*
  from vehicle         a
  join vehicle_names   b using(vehicleName)
  join vehicle_attribs c using(nameId)
 where c.attribute in('SMALL', 'SHINY')
 and a.vehicleName like '%coo%'
 group 
    by a.id
having count(distinct c.attribute) = 2;

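So what I want to achieve is to select rows that have certain attributes and match a name, but with only one entry per matching name: the one with the highest id!

So a working solution for this example would return the rows below: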

id      vehicleName
-----   --------
2       cool car
10      scooter

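if it used some sort of max on the id.

At the moment I get all the entries for cool car and scooter.

My real-world database follows a similar structure and has hundreds of thousands of entries in it, so a query like the one above could easily return 3000+ results. I limit the results to 100 rows to keep execution time low, as the results are used in a search on my website. The reason I have repeated "vehicles" with the same name but a different id is that new models are constantly added, but I keep the older ones around for those who want to dig them up! But in a search by car name I don't want to return the older entries, only the newest one, which is the one with the highest id!

A correct answer would adapt the query I provided above, which I'm currently using, so that it returns only the rows where the name matches and the id is the highest!

If this isn't possible, suggestions on how I can achieve my goal without significantly increasing the execution time of a search would be appreciated!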

Answer 1

If you want to keep your logic, here is what I would do:

select a.*
from vehicle a
    left join vehicle a2 on (a.vehicleName = a2.vehicleName and a.id < a2.id)
    join vehicle_names   b on (a.vehicleName = b.vehicleName)
    join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
    and a.vehicleName like '%coo%'
    and a2.id is null
group by a.id
having count(distinct c.attribute) = 2;

Which yields:

+----+-------------+
| id | vehicleName |
+----+-------------+
|  2 | cool car    |
| 10 | scooter     |
+----+-------------+
2 rows in set (0.00 sec)

As others have said, normalization could be done on a few levels:

Keeping your current vehicle_names table as the primary lookup table, I would change:

update vehicle a
    inner join vehicle_names b using (vehicleName)
set a.vehicleName = b.nameId;
alter table vehicle change column vehicleName nameId int;

create table attribs (
    attribId int auto_increment primary key,
    attribute varchar(20),
    unique key attribute (attribute)
);
insert into attribs (attribute)
    select distinct attribute from vehicle_attribs;
update vehicle_attribs a
    inner join attribs b using (attribute)
set a.attribute=b.attribId;
alter table vehicle_attribs change column attribute attribId int;

Resulting in the following query:

select a.id, b.vehicleName
from vehicle a
    left join vehicle a2 on (a.nameId = a2.nameId and a.id < a2.id)
    join vehicle_names b on (a.nameId = b.nameId)
    join vehicle_attribs c on (a.nameId=c.nameId)
    inner join attribs d using (attribId)
where d.attribute in ('SMALL', 'SHINY')
    and b.vehicleName like '%coo%'
    and a2.id is null
group by a.id
having count(distinct d.attribute) = 2;

Answer 2

The table doesn't seem to be normalized, but this should help you get there:

select max(id), vehicleName
from VEHICLES
group by vehicleName
having count(*)>=2;

Answer 3

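I'm not sure I fully understand your model, but the following query satisfies your requirements as they stand. The first subquery finds the latest version of each vehicle. The second subquery satisfies your "and" conditions. Then I join the two subqueries on vehiclename (which is the key?).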

select a.id
      ,a.vehiclename
  from (select a.vehicleName, max(id) as id
          from vehicle a
         where vehicleName like '%coo%'
        group by vehicleName
       ) as a
  join (select b.vehiclename
          from vehicle_names   b
          join vehicle_attribs c using(nameId)
         where c.attribute in('SMALL', 'SHINY') 
        group by b.vehiclename
        having count(distinct c.attribute) = 2
       ) as b on (a.vehicleName = b.vehicleName);

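If this "latest vehicle" logic is something you will need to do a lot, a small suggestion is to create a view (see below) that returns the latest version of each vehicle. You could then use the view instead of the find-max subquery. Note that this is purely for ease of use; it has no performance benefit.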

select *
  from vehicle a
 where id = (select max(b.id)
               from vehicle b
              where a.vehiclename = b.vehiclename);
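
As a rough sketch, the select above could be wrapped in a view and then used in place of the find-max subquery. The view name latest_vehicle is made up for illustration, and the second statement is only one way the question's base query might be written against it, assuming the vehicle, vehicle_names and vehicle_attribs tables from the question.

```sql
-- Hypothetical view name; the body is the correlated-subquery select shown above.
create view latest_vehicle as
select a.*
  from vehicle a
 where a.id = (select max(b.id)
                 from vehicle b
                where b.vehicleName = a.vehicleName);

-- The base query from the question, reading from the view instead of vehicle,
-- so only the highest id per vehicleName can appear in the result.
select v.id, v.vehicleName
  from latest_vehicle  v
  join vehicle_names   b on (v.vehicleName = b.vehicleName)
  join vehicle_attribs c on (b.nameId = c.nameId)
 where c.attribute in ('SMALL', 'SHINY')
   and v.vehicleName like '%coo%'
 group by v.id, v.vehicleName
having count(distinct c.attribute) = 2;
```

On the sample data this should leave only ids 2 (cool car) and 10 (scooter), the same rows the question asks for.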

Answer 4

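Without a proper redesign of your model, you could

1) Add a column IsLatest that your application manages.

This is not perfect, but it will solve your problem (until the next problem; see the note at the end). All you have to do when adding a new entry is issue queries such as

-- clear the flag only on the previous latest row for this name
UPDATE vehicle
SET IsLatest = 0
WHERE IsLatest = 1
  AND vehicleName = @name;

INSERT INTO vehicle (vehicleName) VALUES (@name);

-- LAST_INSERT_ID() assumes vehicle.id is AUTO_INCREMENT
UPDATE vehicle
SET IsLatest = 1
WHERE id = LAST_INSERT_ID();

in a transaction or a trigger.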

2) Alternatively, you can find the max id before you issue your query

SELECT MAX(id)
FROM vehicle
WHERE vehicleName = @name

3) You can do it in a single SQL statement, and provided there is an index on (vehicleName, id) it should actually have decent speed

select a.*
  from vehicle         a
  join vehicle_names   b ON a.vehicleName = b.vehicleName
  join vehicle_attribs c ON b.nameId = c.nameId AND c.attribute = 'SMALL'
  join vehicle_attribs d ON b.nameId = d.nameId AND d.attribute = 'SHINY'
  left join vehicle    notmax ON a.vehicleName = notmax.vehicleName AND a.id < notmax.id
 where a.vehicleName like '%coo%'
       AND notmax.id IS NULL

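I have removed your GROUP BY and HAVING and replaced them with another join (assuming a given attribute can appear only once per nameId).

I have also used one of the ways to find the max per group, namely joining the table to itself and keeping only the rows for which no record with the same name has a bigger id.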
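
To make the self-join idea concrete, here it is on its own, without the attribute filters, together with the kind of index that could support it. This is only a sketch: the index name is made up, and it assumes the vehicle table from the question with id as its unique key.

```sql
-- Hypothetical index name; lets the self-join look up rows by name and compare ids.
create index idx_vehicle_name_id on vehicle (vehicleName, id);

-- For each row a, look for a row with the same name and a bigger id.
-- If none exists (notmax.id is null), a is the latest row for that name.
select a.id, a.vehicleName
  from vehicle a
  left join vehicle notmax
         on a.vehicleName = notmax.vehicleName
        and a.id < notmax.id
 where notmax.id is null
 order by a.id;
```

On the sample data this should return ids 2, 5, 6, 7, 8, 10 and 11, one row per vehicleName.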

There are other ways; search for 'max per group sql'. Also see here, though it is not complete.

MySQL: select rows with the max id that also match other conditions
