Optimizing a MySQL query with INNER JOIN and WHERE

Time: 2016-01-21 14:03:59

Tags: mysql query-optimization mariadb

I have this query:

SELECT DISTINCT h.id,
                h.host
FROM pozycje p
INNER JOIN hosty h ON p.host_id = h.id
INNER JOIN keywordy k ON k.id=p.key_id
AND k.bing=0
WHERE h.archive_data_checked IS NULL LIMIT 20

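It is fast when some rows match, but when there are no results it takes 2-3 seconds to execute. I would like it to take less than 1 second. The EXPLAIN output is here: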

http://tinyurl.com/gogx42n

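The pozycje table has 30,000,000 rows, hosty has 4,000,000 rows, and keywordy has 40,000 rows. The engine is InnoDB, on a server with 32 GB of RAM.

Which indexes or other improvements could I use to speed up the query when there are no results?

Edit: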

SHOW CREATE TABLE keywordy;

 CREATE TABLE `keywordy` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `main_kw` varchar(255) CHARACTER SET utf8 NOT NULL,
 `keyword` varchar(255) CHARACTER SET utf8 NOT NULL,
 `lang` varchar(10) CHARACTER SET utf8 NOT NULL,
 `searches` int(11) NOT NULL,
 `cpc` float NOT NULL,
 `competition` float NOT NULL,
 `currency` varchar(10) CHARACTER SET utf8 NOT NULL,
 `data` date DEFAULT NULL,
 `adwords` int(11) NOT NULL,
 `monitoring` tinyint(1) NOT NULL DEFAULT '0',
 `bing` tinyint(1) NOT NULL DEFAULT '0',
 PRIMARY KEY (`id`),
 UNIQUE KEY `keyword` (`keyword`,`lang`),
 KEY `id_bing` (`id`,`bing`)
) ENGINE=InnoDB AUTO_INCREMENT=38362 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

2 answers:

Answer 0 (score: 0)

You could test this:

-- keywordy is checked inside the pozycje subquery so that p.key_id is in scope
SELECT DISTINCT h.id,
                h.host
FROM hosty h
WHERE h.archive_data_checked IS NULL
  AND EXISTS ( SELECT 1
               FROM pozycje p
               INNER JOIN keywordy k ON k.id = p.key_id
               WHERE p.host_id = h.id
                 AND k.bing = 0 )
LIMIT 20

Answer 1 (score: 0)

I would first ask the following question: which would give the smaller "set" if you ran these queries?

select count(*) from keywordy where bing = 0
vs
select count(*) from hosty where archive_data_checked IS NULL

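Then I would optimize the query around the smaller set and use it as the primary criterion for the indexes. If keywordy is more likely to be the smaller set, I would give the tables the following indexes: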

table       index
keywordy    (bing, id)   specifically NOT (id, bing): with bing FIRST, the index serves the WHERE or JOIN condition on bing
pozycje     (key_id, host_id )
hosty       (archive_data_checked, id, host)
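As a rough sketch, the indexes in the table above could be created like this. The index names (idx_bing_id and so on) are made up for illustration, and the hosty schema is not shown in the question, so the last statement assumes that host is short enough to be indexed in full.

```sql
-- Hypothetical index names; the column order follows the table above.
ALTER TABLE keywordy ADD INDEX idx_bing_id (bing, id);
ALTER TABLE pozycje  ADD INDEX idx_key_host (key_id, host_id);

-- Assumption: hosty.host fits within the index key length limit.
-- If it is a long VARCHAR, a prefix length such as host(100) may be needed,
-- and the index would then no longer fully cover the query.
ALTER TABLE hosty ADD INDEX idx_archive_id_host (archive_data_checked, id, host);
```

Building an index on the 30,000,000-row pozycje table can take a while, so it is probably worth trying on a copy first.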

SELECT DISTINCT 
      h.id,
      h.host
   FROM 
      keywordy k
         JOIN pozycje p
            ON k.id = p.key_id
            JOIN hosty h
               on h.archive_data_checked IS NULL
              AND p.host_id = h.id
   WHERE
      k.bing = 0
   LIMIT 
      20

If the set from the hosty table filtered on archive_data_checked IS NULL would be smaller, I offer the following instead:

table       index
pozycje     (host_id, key_id )    reverse of the other option
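A minimal sketch of creating that reversed index, again with a made-up index name:

```sql
-- Hypothetical index name; host_id comes first so that the join from hosty
-- into pozycje can use the index, with key_id available for the next join.
ALTER TABLE pozycje ADD INDEX idx_host_key (host_id, key_id);
```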

SELECT DISTINCT 
      h.id,
      h.host
   FROM 
      hosty h 
         JOIN pozycje p
            ON h.id = p.host_id
            JOIN keywordy k
               on k.bing = 0
              AND p.key_id = k.id
   WHERE 
      h.archive_data_checked IS NULL 
   LIMIT 
      20

One FINAL option might be to add the keyword "STRAIGHT_JOIN", like this:

select STRAIGHT_JOIN DISTINCT ... rest of query

If that works for you, what timing improvement does it give?
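For illustration, here is the keywordy-first variant written out in full with STRAIGHT_JOIN. This is only a sketch: it assumes keywordy with bing = 0 really is the smaller set, and whether it helps depends on the data and the available indexes.

```sql
-- STRAIGHT_JOIN makes the optimizer join the tables in the order listed:
-- keywordy first, then pozycje, then hosty.
SELECT STRAIGHT_JOIN DISTINCT
      h.id,
      h.host
   FROM
      keywordy k
         JOIN pozycje p
            ON k.id = p.key_id
         JOIN hosty h
            ON p.host_id = h.id
           AND h.archive_data_checked IS NULL
   WHERE
      k.bing = 0
   LIMIT
      20;
```

Comparing the EXPLAIN output with and without the keyword should show whether the join order actually changed.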