MySQL large table query optimization

Date: 2016-01-09 21:40:17

Tags: mysql large-data

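I have a chat application with an API that returns the list of users a given user has talked with. Once the table reaches about 100,000 rows, MySQL takes a long time to return that list. This is my messages table: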

CREATE TABLE IF NOT EXISTS `messages` (
  `_id` int(11) NOT NULL AUTO_INCREMENT,
  `fromid` int(11) NOT NULL,
  `toid` int(11) NOT NULL,
  `message` text NOT NULL,
  `attachments` text NOT NULL,
  `status` tinyint(1) NOT NULL DEFAULT '0',
  `date` datetime NOT NULL,
  `delete` varchar(50) NOT NULL,
  `uuid_read` varchar(250) NOT NULL,
  PRIMARY KEY (`_id`),
  KEY `fromid` (`fromid`,`toid`,`status`,`delete`,`uuid_read`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=118561 ;

This is my users table (simplified):

CREATE TABLE IF NOT EXISTS `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `login` varchar(50) DEFAULT '',
  `sex` tinyint(1) DEFAULT '0',
  `status` varchar(255) DEFAULT '',
  `avatar` varchar(30) DEFAULT '0',
  `last_active` datetime DEFAULT NULL,
  `active` tinyint(1) DEFAULT '1',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=15523 ;

This is my query (for the user with ID 1930):

select SQL_CALC_FOUND_ROWS `u_id`, `id`, `login`, `sex`, `birthdate`, `avatar`, `online_status`, SUM(`count`) as `count`, SUM(`nr_count`) as `nr_count`, `date`, `last_mesg` from
(
(select `m`.`fromid` as `u_id`, `u`.`id`, `u`.`login`, `u`.`sex`, `u`.`birthdate`, `u`.`avatar`, `u`.`last_active` as online_status, COUNT(`m`.`_id`) as `count`, (COUNT(`m`.`_id`)-SUM(`m`.`status`)) as `nr_count`, `tm`.`date` as `date`, `tm`.`message` as `last_mesg` from `messages` as m inner join `messages` as tm on `tm`.`_id`=(select MAX(`_id`) from `messages` as `tmz` where `tmz`.`fromid`=`m`.`fromid`) left join `users` as u on `u`.`id`=`m`.`fromid` where `m`.`toid`=1930 and `m`.`delete` not like '%1930;%' group by `u`.`id`)
UNION
(select `m`.toid as `u_id`, `u`.`id`, `u`.`login`, `u`.`sex`, `u`.`birthdate`, `u`.`avatar`, `u`.`last_active` as online_status, COUNT(`m`.`_id`) as `count`, 0 as `nr_count`, `tm`.`date` as `date`, `tm`.`message` as `last_mesg` from `messages` as m inner join `messages` as tm on `tm`.`_id`=(select MAX(`_id`) from `messages` as `tmz` where `tmz`.`toid`=`m`.`toid`) left join `users` as u on `u`.`id`=`m`.`toid` where `m`.`fromid`=1930 and `m`.`delete` not like '%1930;%' group by `u`.`id`)
order by `date` desc ) as `f` group by `u_id` order by `date` desc limit 0,10

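Please help me optimize this query.

What I need:

  • who the user has talked with (name, sex, etc.);
  • the last message of each conversation (from me or to me);
  • the number of messages (all);
  • the number of unread messages (only for me).

The query works correctly, but it takes too long.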

The output must look like this:

[Image: screenshot of the expected output]

1 answer:

Answer 0 (score: 1):

Your query and your database have some design issues.

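  • You should avoid using keywords as column names, such as the delete column or the count column;
  • You should avoid selecting columns that are not declared in the GROUP BY unless they are wrapped in an aggregate function. MySQL allows this, but it is not standard and you have no control over which row's data will be selected;
  • Your NOT LIKE construct can make the query misbehave, because '%1930;%' also matches 11930; and 11930 is not equal to 1930;
  • You should avoid LIKE patterns that both start and end with the % wildcard. They make text processing take longer, and a leading wildcard prevents an index from being used for the match;
  • You should design a better way to represent message deletion, perhaps a better flag and/or another table that holds any important data related to that action;
  • Try to LIMIT your results before the join conditions (using a derived table), so that less processing is performed.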
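The points about the NOT LIKE pattern and about modelling deletions can be illustrated with a small sketch. It assumes that the `delete` column stores a semicolon-terminated list of user ids (inferred from the '%1930;%' pattern, not stated in the question). The table `message_deletions` and its columns are hypothetical names introduced only for this example.

```sql
-- The substring test also matches a longer id that merely ends in 1930.
SELECT '11930;' LIKE '%1930;%' AS false_match;  -- evaluates to 1 (true)

-- Hypothetical table: one row per (message, user) who deleted that message.
CREATE TABLE IF NOT EXISTS message_deletions (
  message_id INT NOT NULL,
  user_id    INT NOT NULL,
  deleted_at DATETIME NOT NULL,
  PRIMARY KEY (message_id, user_id)
) ENGINE=InnoDB;

-- Messages involving user 1930 that user 1930 has not deleted.
-- The NOT EXISTS probe is an exact primary-key lookup, not a text scan.
SELECT m._id, m.fromid, m.toid, m.`date`
FROM messages AS m
WHERE (m.fromid = 1930 OR m.toid = 1930)
  AND NOT EXISTS (
    SELECT 1
    FROM message_deletions AS d
    WHERE d.message_id = m._id
      AND d.user_id = 1930
  );
```

With a layout like this, the deletion check compares integers exactly, so user 11930 can no longer be confused with user 1930.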

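I tried to rewrite your query as best I understood it. I ran it against a messages table with about 200,000 rows and no indexes, and it executed in 0.15 seconds. Of course, you should still create the right indexes to help it perform well as the data volume grows.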

SELECT SQL_CALC_FOUND_ROWS 
  u.id, 
  u.login, 
  u.sex, 
  u.birthdate, 
  u.avatar, 
  u.last_active AS online_status, 
  g._count, 
  CASE WHEN m.toid = 1930 
    THEN g.nr_count 
    ELSE 0 
  END AS nr_count, 
  m.`date`, 
  m.message AS last_mesg 
FROM
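(
  -- One row per conversation partner of user 1930, limited before any join.
  SELECT
    MAX(_id) AS _id,                      -- highest _id, taken as the latest message
    COUNT(*) AS _count,                   -- all messages in the conversation
    COUNT(*) - SUM(m.status) AS nr_count  -- total minus SUM(status), as in the question
  FROM messages m
  WHERE 1=1
    AND m.`delete` NOT LIKE '%1930;%'
    AND
    (0=1
      OR m.fromid = 1930
      OR m.toid   = 1930
    )
  -- The partner is whichever side of the message is not user 1930.
  GROUP BY
    CASE WHEN m.fromid = 1930
      THEN m.toid
      ELSE m.fromid
    END
  ORDER BY MAX(`date`) DESC
  LIMIT 0, 10
) g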
INNER JOIN messages AS m ON 1=1 
  AND m._id = g._id
LEFT JOIN users AS u ON 0=1 
  OR (m.fromid <> 1930 AND u.id = m.fromid)
  OR (m.toid   <> 1930 AND u.id = m.toid)
ORDER BY m.`date` DESC
;
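
As a follow-up to the remark about creating the right indexes, here is a minimal sketch of what that could look like. The index name `idx_messages_toid_fromid` is hypothetical. The existing `fromid` key already starts with `fromid`, so the sketch only adds an index for lookups by `toid`. Whether the optimizer combines the two indexes for the `OR` condition depends on the MySQL version and the data, so the `EXPLAIN` output is worth checking. The last statement is an assumption about how the total for pagination could be obtained: because the `LIMIT` sits inside the derived table, `FOUND_ROWS()` after the rewritten query reports at most 10 rows, and `SQL_CALC_FOUND_ROWS` is deprecated as of MySQL 8.0.17.

```sql
-- Hypothetical index so that "toid = 1930" no longer needs a full table scan.
ALTER TABLE messages
  ADD INDEX idx_messages_toid_fromid (toid, fromid);

-- Inspect which indexes the optimizer picks for the two-sided filter.
EXPLAIN
SELECT COUNT(*)
FROM messages AS m
WHERE m.fromid = 1930 OR m.toid = 1930;

-- Total number of conversation partners of user 1930, for pagination.
SELECT COUNT(DISTINCT CASE WHEN m.fromid = 1930
                           THEN m.toid
                           ELSE m.fromid
                      END) AS total_conversations
FROM messages AS m
WHERE m.`delete` NOT LIKE '%1930;%'
  AND (m.fromid = 1930 OR m.toid = 1930);
```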