Remove the subqueries used to sort grouped data

Time: 2016-04-07 08:43:43

Tags: mysql query-optimization

CREATE TABLE `aircrafts_in` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `city_from` int(11) NOT NULL COMMENT 'Откуда',
 `city_to` int(11) NOT NULL COMMENT 'Куда',
 PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=91 DEFAULT CHARSET=utf8 COMMENT='Самолёты по направлениям'

CREATE TABLE `aircrafts_in_parsed_data` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `price` int(11) NOT NULL COMMENT 'Ценник',
 `airline` varchar(255) NOT NULL COMMENT 'Авиакомпания',
 `date` date NOT NULL COMMENT 'Дата вылета',
 `info_id` int(11) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `info_id` (`info_id`),
 KEY `price` (`price`),
 KEY `date` (`date`)
) ENGINE=InnoDB AUTO_INCREMENT=940682 DEFAULT CHARSET=utf8

`date` - departure date

CREATE TABLE `aircrafts_in_parsed_info` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `status` enum('success','error') DEFAULT NULL,
 `type` enum('roundtrip','oneway') NOT NULL,
 `date` datetime NOT NULL COMMENT 'Дата парсинга',
 `aircrafts_in_id` int(11) DEFAULT NULL COMMENT 'ID направления',
 PRIMARY KEY (`id`),
 KEY `aircrafts_in_id` (`aircrafts_in_id`)
) ENGINE=InnoDB AUTO_INCREMENT=577759 DEFAULT CHARSET=utf8

`date` - creation date, i.e. the time of parsing

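Task

Get the minimum ticket price and its departure date for each month. Note that the minimum price must be the current (up-to-date) one, not simply the lowest value ever recorded. If several dates share the lowest price, we need the first one.

My solution

I think something is not quite right here. I don't like the subqueries used for grouping. How can I get rid of them?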

select *
from (
    select * from (
        select airline,
        price,
        pdata.`date` as `date`
        from aircrafts_in_parsed_data `pdata`
        inner join aircrafts_in_parsed_info `pinfo`
        on pdata.`info_id` = pinfo.`id`
        where pinfo.`aircrafts_in_id` = {$id}
            and pinfo.status = 'success'
            and pinfo.`type` = 'roundtrip'
            and `price` <> 0
        group by pdata.`date`, year(pinfo.`date`) desc, month(pinfo.`date`) desc, day(pinfo.`date`) desc
    ) base
    group by `date`
    order by price, year(`date`) desc, month(`date`) desc, day(`date`) asc
) minpriceperdate
group by year(`date`) desc, month(`date`) desc

Without the cache it takes 0.015 seconds; the table sizes can be seen from the AUTO_INCREMENT values.

2 answers:

Answer 0: (score: 0)

SELECT  MIN(price) AS min_price,
        LEFT(date, 7) AS yyyy_mm
    FROM aircrafts_in_parsed_data
    GROUP BY LEFT(date, 7)

will get the lowest price for each month. But it cannot tell you which row is the "first".

From my groupwise-max cheat-sheet, I derived this:

SELECT
        yyyy_mm, date, price, airline  -- The desired columns
    FROM
      ( SELECT  @prev := '' ) init
    JOIN
      ( SELECT  LEFT(date, 7) != @prev  AS first,
                @prev := LEFT(date, 7),
                LEFT(date, 7) AS yyyy_mm, date, price, airline
            FROM  aircrafts_in_parsed_data
            ORDER BY
                LEFT(date, 7),   -- The 'GROUP BY'
                price ASC,       -- ASC to do "MIN()"
                date             -- To get the 'first' if there are dup prices for a month
      ) x
    WHERE  first           -- extract only the first of the lowest price for each month
    ORDER BY  yyyy_mm;     -- Whatever you like

Sorry, but the subquery is necessary. (I avoided using YEAR(), MONTH(), DAY().)
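If MySQL 8.0 or later is available (an assumption; the question does not state a version), window functions offer an alternative to the user-variable trick, which relies on assigning variables inside a SELECT. The following is only a sketch of that idea, not part of the original answer: the `price <> 0` filter is borrowed from the question, and the aliases `ranked` and `rn` are chosen here for illustration.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
SELECT yyyy_mm, `date`, price, airline
FROM (
    SELECT LEFT(`date`, 7) AS yyyy_mm,
           `date`, price, airline,
           -- Number the rows of each month: cheapest first,
           -- earliest departure date first among equal prices.
           ROW_NUMBER() OVER (
               PARTITION BY LEFT(`date`, 7)
               ORDER BY price ASC, `date` ASC
           ) AS rn
    FROM aircrafts_in_parsed_data
    WHERE price <> 0          -- assumption: taken from the question's filter
) ranked
WHERE rn = 1                  -- keep only the first row of each month
ORDER BY yyyy_mm;
```

A derived table is still needed here, because the result of a window function cannot be filtered in the WHERE clause of the same query level.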

Answer 1: (score: 0)

You are right, your query is not correct.

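Let's start with the innermost query: you group by pdata.date + pinfo.date, so you get one result row per date combination. Since you don't specify which price or airline you are interested in for each date combination (e.g. MAX(airline), MIN(price)), you get one arbitrarily chosen airline for the date combination and also one arbitrarily chosen price. These don't even have to belong to the same record in the table; the DBMS is free to pick one airline and one price that match the dates. Well, maybe the combination of pdata.date and pinfo.date is already unique, but then you would not need to group at all. So however we look at it, this is not appropriate.

In the next query you group by pdata.date only, thus again getting arbitrary matches for airline and price. You could have done that in the innermost query already. Instead of saying "give me a randomly picked price per pdata.date and pinfo.date, and from these give me a randomly picked price per pdata.date", you could just as well say directly "give me a randomly picked price per pdata.date". Then you order the result rows. This is completely useless, because you use the result as a subquery (derived table) again, and a derived table is considered an unordered set. So the ORDER BY gives the DBMS more work to do, but it is in no way guaranteed to influence the main query's result.

In your main query you group by year and month, which again results in arbitrarily chosen values.

Here is the same query, shorter and cleaner: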

select 
  pdata.airline,  -- some arbitrarily chosen airline matching year and month
  pdata.price,    -- some arbitrarily chosen price matching year and month
  pdata.date      -- some arbitrarily chosen date matching year and month
from aircrafts_in_parsed_data pdata
inner join aircrafts_in_parsed_info pinfo on pdata.info_id = pinfo.id
where pinfo.aircrafts_in_id = {$id}
and pinfo.status = 'success'
and pinfo.type = 'roundtrip'
and pdata.price <> 0
group by year(pdata.date), month(pdata.date)
order by year(pdata.date) desc, month(pdata.date) desc
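One way to have MySQL point out such arbitrary selections is the ONLY_FULL_GROUP_BY SQL mode (part of the default modes since MySQL 5.7.5). The sketch below is an illustration rather than part of the original answer; the literal `1` stands in for the `{$id}` placeholder, and the column aliases are chosen here.

```sql
-- Sketch: make MySQL reject non-aggregated columns that are not grouped.
-- Note: this replaces any other SQL modes for the current session.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode on, selecting bare airline/price/date while grouping by
-- year and month fails with error 1055. Every selected column has to be
-- either grouped or aggregated, which makes each choice explicit:
select
  year(pdata.date)  as departure_year,
  month(pdata.date) as departure_month,
  min(pdata.price)  as min_price
from aircrafts_in_parsed_data pdata
inner join aircrafts_in_parsed_info pinfo on pdata.info_id = pinfo.id
where pinfo.aircrafts_in_id = 1   -- placeholder for {$id}
and pinfo.status = 'success'
and pinfo.type = 'roundtrip'
and pdata.price <> 0
group by year(pdata.date), month(pdata.date)
order by departure_year desc, departure_month desc;
```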

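As to the original task (as far as I understand it): find the records with the minimum price per month. Per month means GROUP BY month. The minimum price is MIN(price).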

select
  min_price_record.departure_year,
  min_price_record.departure_month,
  min_price_record.min_price,
  full_record.departure_date,
  full_record.airline
from
(
  select 
    year(`date`) as departure_year, 
    month(`date`) as departure_month,
    min(price) as min_price
  from aircrafts_in_parsed_data
  where price <> 0
  and info_id in
  (
    select id
    from aircrafts_in_parsed_info
    where aircrafts_in_id = {$id}
    and status = 'success'
    and type = 'roundtrip'
  )
  group by year(`date`), month(`date`)
) min_price_record
join
(
  select 
    `date` as departure_date, 
    year(`date`) as departure_year, 
    month(`date`) as departure_month,
    price,
    airline
  from aircrafts_in_parsed_data
  where price <> 0
  and info_id in
  (
    select id
    from aircrafts_in_parsed_info
    where aircrafts_in_id = {$id}
    and status = 'success'
    and type = 'roundtrip'
  )
) full_record on full_record.departure_year = min_price_record.departure_year
              and full_record.departure_month = min_price_record.departure_month
              and full_record.price = min_price_record.min_price
order by 
  min_price_record.departure_year desc,
  min_price_record.departure_month desc;
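The join above returns every record that ties for the monthly minimum. Since the task asks for the first date in that case, one possible variation is to aggregate once more and keep the earliest departure date. This is a sketch under the same filters, not a tested solution; the aliases `m`, `d` and `first_departure_date` are illustrative, and the literal `1` stands in for `{$id}`.

```sql
select
  m.departure_year,
  m.departure_month,
  m.min_price,
  min(d.`date`) as first_departure_date  -- earliest date among ties
from
(
  -- minimum price per departure month
  select
    year(`date`) as departure_year,
    month(`date`) as departure_month,
    min(price) as min_price
  from aircrafts_in_parsed_data
  where price <> 0
  and info_id in
  (
    select id
    from aircrafts_in_parsed_info
    where aircrafts_in_id = 1   -- placeholder for {$id}
    and status = 'success'
    and type = 'roundtrip'
  )
  group by year(`date`), month(`date`)
) m
join aircrafts_in_parsed_data d
  on year(d.`date`) = m.departure_year
 and month(d.`date`) = m.departure_month
 and d.price = m.min_price
where d.info_id in
(
  -- same filter as above, so only matching parses are considered
  select id
  from aircrafts_in_parsed_info
  where aircrafts_in_id = 1     -- placeholder for {$id}
  and status = 'success'
  and type = 'roundtrip'
)
group by m.departure_year, m.departure_month, m.min_price
order by m.departure_year desc, m.departure_month desc;
```

The airline is left out on purpose: several airlines may offer that price on that date, so fetching it would need another join or an explicit aggregate.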