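I'm rewriting the SQL that lets a user search for any other user on our site and shows their roles.
For example, a role can be "writer", "editor", or "publisher".
Each role links a user to a publication.
A user can hold multiple roles across multiple publications.
Example table setup: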
"users" : user_id, firstname, lastname
"publications" : publication_id, name
"link_writers" : user_id, publication_id
"link_editors" : user_id, publication_id
Current pseudo-SQL:
SELECT * FROM (
(SELECT user_id FROM users WHERE firstname LIKE '%Jenkz%')
UNION
(SELECT user_id FROM users WHERE lastname LIKE '%Jenkz%')
) AS dt
JOIN (ROLES STATEMENT) AS roles ON roles.user_id = dt.user_id
My ROLES STATEMENT is currently:
SELECT dt2.user_id, dt2.publication_id, dt2.role FROM (
(SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
FROM link_writers)
UNION
(SELECT 'editor' AS role, link_editors.user_id, link_editors.publication_id
FROM link_editors)
) AS dt2
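The reason for wrapping the roles statement in UNION clauses is that some roles are more complex and need a table join to find the publication_id and user_id.
For example, "publisher" might be linked across two tables: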
"link_publishers": user_id, publisher_group_id
"link_publisher_groups": publisher_group_id, publication_id
So in that case, the query forming part of my UNION would be:
SELECT 'publisher' AS role, link_publishers.user_id, link_publisher_groups.publication_id
FROM link_publishers
JOIN link_publisher_groups ON link_publisher_groups.publisher_group_id = link_publishers.publisher_group_id
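I'm fairly confident my table setup is sound (while researching the layout I was warned against putting everything into one table). My problem is that there are now 100,000 rows in the users table and up to 70,000 rows in each of the link tables.
The initial lookup in the users table is fast, but the joins really slow things down.
How can I join only the relevant roles?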
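-------------------------- EDIT --------------------------
EXPLAIN output: http://img155.imageshack.us/img155/4758/stackusersearchjoins.gif
The EXPLAIN output is linked above (open it in a new window to see it at full resolution).
The red part at the bottom is the "WHERE firstname LIKE '%Jenkz%'" search. The third row searches WHERE CONCAT(firstname, ' ', lastname) LIKE '%Jenkz%', hence the large row count. I think that is unavoidable, unless there is a way to put an index on concatenated fields?
The green part at the top shows the total rows scanned for the ROLES STATEMENT.
You can then see each individual UNION clause (#6 - #12), all of which show a large number of rows. Some of the indexes are normal, some are unique.
It seems that MySQL isn't optimizing the query to use dt.user_id as a comparison inside the UNION statements. Is there any way to force this behaviour?
Please note that my real setup is not publications and writers but "webmasters", "players", "teams", and so on.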
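Answer 0 (score: 0)
My first thought was to create a temporary table to hold (and index) the user_ids that match the name, and to join it against each link table. Unfortunately, in MySQL a temporary table can be referenced only ONCE in a query.
The nasty workaround is to create a permanent table and add connection_id to its primary key, so that separate sessions don't get mixed up.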
create table tt ( connection_id int not null,
user_id int not null,
firstname varchar(10) not null,
lastname varchar(10) not null,
primary key( connection_id, user_id ) );
The following sequence is repeated every time an answer is needed:
delete from tt where connection_id = connection_id();
insert into tt
SELECT connection_id(), user_id, firstname, lastname FROM users
WHERE firstname LIKE '%Jenkz%'
UNION
SELECT connection_id(), user_id, firstname, lastname FROM users
WHERE lastname LIKE '%Jenkz%';
Next, extend your existing UNION so that only the relevant user_ids are pulled out:
SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
FROM link_writers
JOIN tt ON tt.connection_id = connection_id() and tt.user_id = link_writers.user_id
UNION
SELECT 'editor' AS role, link_editors.user_id, link_editors.publication_id
FROM link_editors
JOIN tt ON tt.connection_id = connection_id() and tt.user_id = link_editors.user_id
UNION
SELECT 'publisher' AS role, link_publishers.user_id, link_publisher_groups.publication_id
FROM link_publishers
JOIN link_publisher_groups
ON link_publisher_groups.publisher_group_id = link_publishers.publisher_group_id
JOIN tt ON tt.connection_id = connection_id() and tt.user_id = link_publishers.user_id
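This may be an improvement, since not every row of every link table is pulled into the union.
The EXPLAIN is a bit odd in that only 4 bytes of the index on tt are used - I would have expected all 8. Perhaps that is because I have so little data in tt.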
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: tt
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra: Using index
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: link_writers
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: test.tt.user_id
rows: 1
Extra: Using index
*************************** 3. row ***************************
id: 2
select_type: UNION
table: tt
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra: Using index
*************************** 4. row ***************************
id: 2
select_type: UNION
table: link_editors
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: test.tt.user_id
rows: 1
Extra: Using index
*************************** 5. row ***************************
id: 3
select_type: UNION
table: tt
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra: Using index
*************************** 6. row ***************************
id: 3
select_type: UNION
table: link_publishers
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: test.tt.user_id
rows: 1
Extra: Using index
*************************** 7. row ***************************
id: 3
select_type: UNION
table: link_publisher_groups
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: test.link_publishers.publisher_group_id
rows: 2
Extra: Using index
*************************** 8. row ***************************
id: NULL
select_type: UNION RESULT
table: <union1,2,3>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra:
8 rows in set (0.00 sec)
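The delete / insert / join sequence in this answer can be tried end to end with a small sketch. This is only an illustration under stated assumptions, not the MySQL setup itself: it uses Python's built-in sqlite3 module, SQLite has no connection_id(), so a hypothetical session_id argument stands in for it, tt is trimmed to the two key columns, the publisher branch is left out, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE link_writers (user_id INT, publication_id INT,
                               PRIMARY KEY (user_id, publication_id));
    CREATE TABLE link_editors (user_id INT, publication_id INT,
                               PRIMARY KEY (user_id, publication_id));
    -- Scratch table shared by all sessions; connection_id keeps them apart.
    CREATE TABLE tt (connection_id INT NOT NULL, user_id INT NOT NULL,
                     PRIMARY KEY (connection_id, user_id));
    -- Invented sample data.
    INSERT INTO users VALUES (1, 'Bob', 'Jenkzson'), (2, 'Jenkz', 'Smith'), (3, 'Ann', 'Lee');
    INSERT INTO link_writers VALUES (1, 10), (3, 10);
    INSERT INTO link_editors VALUES (2, 20), (3, 20);
""")


def search_roles(conn, session_id, needle):
    """Refill this session's slice of tt, then join each link table against it."""
    pattern = f"%{needle}%"
    # session_id is a stand-in for MySQL's connection_id() (assumption).
    conn.execute("DELETE FROM tt WHERE connection_id = ?", (session_id,))
    conn.execute(
        """INSERT INTO tt
           SELECT ?, user_id FROM users WHERE firstname LIKE ?
           UNION
           SELECT ?, user_id FROM users WHERE lastname LIKE ?""",
        (session_id, pattern, session_id, pattern),
    )
    # Each UNION branch is restricted to the matching users via tt.
    return conn.execute(
        """SELECT 'writer' AS role, w.user_id, w.publication_id
           FROM link_writers w
           JOIN tt ON tt.connection_id = ? AND tt.user_id = w.user_id
           UNION
           SELECT 'editor' AS role, e.user_id, e.publication_id
           FROM link_editors e
           JOIN tt ON tt.connection_id = ? AND tt.user_id = e.user_id
           ORDER BY 2, 1""",
        (session_id, session_id),
    ).fetchall()


rows_session_1 = search_roles(conn, 1, "Jenkz")
rows_session_2 = search_roles(conn, 2, "Lee")
tt_count = conn.execute("SELECT COUNT(*) FROM tt").fetchone()[0]
```

Two sessions searching for different names leave separate slices in tt, which is the point of putting connection_id first in the primary key.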
Answer 1 (score: 0)
After looking at OMG Ponies' answer to SO - Use Of Correlated Subquery, I came up with this:
SELECT * FROM (
(SELECT user_id FROM users WHERE firstname LIKE '%Jenkz%')
UNION
(SELECT user_id FROM users WHERE lastname LIKE '%Jenkz%')
) AS dt
JOIN ( SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
FROM link_writers
UNION
SELECT 'editor' AS role, link_editors.user_id, link_editors.publication_id
FROM link_editors
UNION
SELECT 'publisher' AS role, lp.user_id, lpg.publication_id
FROM link_publishers lp
JOIN link_publisher_groups lpg ON lpg.publisher_group_id = lp.publisher_group_id
) roles on roles.user_id = dt.user_id
The EXPLAIN for this looks reasonable on my small data set. What does it look like on the real thing?
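For anyone who wants to check what this query returns before running it on real data, here is a minimal sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in for MySQL: the sample rows are invented, and because SQLite does not accept parentheses around the individual SELECTs of a UNION, those are dropped. It says nothing about how MySQL will plan the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE link_writers (user_id INT, publication_id INT);
    CREATE TABLE link_editors (user_id INT, publication_id INT);
    CREATE TABLE link_publishers (user_id INT, publisher_group_id INT);
    CREATE TABLE link_publisher_groups (publisher_group_id INT, publication_id INT);
    -- Invented sample data.
    INSERT INTO users VALUES (1, 'Jenkz', 'Smith'), (2, 'Ann', 'Lee');
    INSERT INTO link_writers VALUES (1, 10), (2, 10);
    INSERT INTO link_editors VALUES (2, 20);
    INSERT INTO link_publishers VALUES (1, 100);
    INSERT INTO link_publisher_groups VALUES (100, 30);
""")

pattern = "%Jenkz%"
matches = conn.execute(
    """SELECT dt.user_id, roles.role, roles.publication_id
       FROM (
           SELECT user_id FROM users WHERE firstname LIKE ?
           UNION
           SELECT user_id FROM users WHERE lastname LIKE ?
       ) AS dt
       JOIN (
           SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
           FROM link_writers
           UNION
           SELECT 'editor' AS role, link_editors.user_id, link_editors.publication_id
           FROM link_editors
           UNION
           SELECT 'publisher' AS role, lp.user_id, lpg.publication_id
           FROM link_publishers lp
           JOIN link_publisher_groups lpg
             ON lpg.publisher_group_id = lp.publisher_group_id
       ) roles ON roles.user_id = dt.user_id
       ORDER BY roles.role""",
    (pattern, pattern),
).fetchall()
```

Only user 1 matches the name, so only that user's writer and publisher rows come back; user 2's rows are built by the inner UNION but discarded by the join.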
Answer 2 (score: 0)
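Another approach is to break the normalization of the design slightly so that it supports your query better.
To do this, create a new table "role":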
create table role (
user_id int not null,
role enum ('writer', 'editor', 'publisher' ) not null,
primary key (user_id, role )
);
This is updated whenever a new row containing the user_id is added to one of the link tables:
insert ignore into role values( $user_id, $role );
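After a while it is quite likely that the role entry will already exist, hence the "ignore" modifier.
The table can be bootstrapped from your existing tables: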
insert ignore into role select distinct user_id, 'writer' from link_writers;
insert ignore into role select distinct user_id, 'editor' from link_editors;
insert ignore into role select distinct user_id, 'publisher' from link_publishers;
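As a rough illustration of the bootstrap-then-maintain idea, here is a sketch using Python's built-in sqlite3 module. Several things are assumptions rather than part of the MySQL answer: SQLite spells the modifier INSERT OR IGNORE, it has no ENUM so a CHECK constraint stands in, add_writer is a hypothetical helper, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE link_writers (user_id INT, publication_id INT);
    CREATE TABLE link_editors (user_id INT, publication_id INT);
    -- CHECK stands in for MySQL's ENUM (assumption for SQLite).
    CREATE TABLE role (
        user_id INT NOT NULL,
        role TEXT NOT NULL CHECK (role IN ('writer', 'editor', 'publisher')),
        PRIMARY KEY (user_id, role)
    );
    -- Invented sample data.
    INSERT INTO link_writers VALUES (1, 10), (1, 11), (2, 10);
    INSERT INTO link_editors VALUES (1, 20);
""")

# Bootstrap: one role row per (user, role), however many publications there are.
conn.execute("INSERT OR IGNORE INTO role SELECT DISTINCT user_id, 'writer' FROM link_writers")
conn.execute("INSERT OR IGNORE INTO role SELECT DISTINCT user_id, 'editor' FROM link_editors")


def add_writer(conn, user_id, publication_id):
    """Hypothetical helper: add a link row and keep the role table in step."""
    conn.execute("INSERT INTO link_writers VALUES (?, ?)", (user_id, publication_id))
    # Silently skipped when (user_id, 'writer') is already present.
    conn.execute("INSERT OR IGNORE INTO role VALUES (?, 'writer')", (user_id,))


add_writer(conn, 1, 12)  # user 1 is already a writer, so role is unchanged
add_writer(conn, 3, 10)  # user 3 is new, so a role row is added
role_rows = conn.execute("SELECT user_id, role FROM role ORDER BY user_id, role").fetchall()
```

Note that this sketch only covers inserts. Removing a role row when a user's last link row is deleted would need separate handling, which the answer does not describe.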
Your search query then becomes a set of simple JOINs, which MySQL should have no trouble optimizing:
SELECT
r.user_id,
r.role,
case r.role
when 'writer' then w.publication_id
when 'editor' then e.publication_id
when 'publisher' then pg.publication_id
end as publication_id
FROM (
(SELECT user_id FROM users WHERE firstname LIKE '%Jenkz%')
UNION
(SELECT user_id FROM users WHERE lastname LIKE '%Jenkz%')
) AS dt
JOIN role r on r.user_id = dt.user_id
LEFT JOIN link_writers w on r.user_id = w.user_id and r.role = 'writer'
LEFT JOIN link_editors e on r.user_id = e.user_id and r.role = 'editor'
LEFT JOIN link_publishers p on r.user_id = p.user_id and r.role = 'publisher'
LEFT JOIN link_publisher_groups pg on p.publisher_group_id = pg.publisher_group_id;
This will give a very "wide" answer.
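To see what shape of result the role-driven query produces, the following sketch runs it on a few invented rows with Python's built-in sqlite3 module. It is an illustration under assumptions, not a MySQL benchmark: the role table is filled by hand, the role column is plain TEXT, and the parentheses around the UNION members are dropped because SQLite does not accept them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE role (user_id INT NOT NULL, role TEXT NOT NULL,
                       PRIMARY KEY (user_id, role));
    CREATE TABLE link_writers (user_id INT, publication_id INT);
    CREATE TABLE link_editors (user_id INT, publication_id INT);
    CREATE TABLE link_publishers (user_id INT, publisher_group_id INT);
    CREATE TABLE link_publisher_groups (publisher_group_id INT, publication_id INT);
    -- Invented sample data; role is filled by hand here.
    INSERT INTO users VALUES (1, 'Jenkz', 'A'), (2, 'B', 'C');
    INSERT INTO link_writers VALUES (1, 10), (1, 11), (2, 10);
    INSERT INTO link_editors VALUES (1, 20);
    INSERT INTO link_publishers VALUES (1, 100);
    INSERT INTO link_publisher_groups VALUES (100, 30), (100, 31);
    INSERT INTO role VALUES (1, 'writer'), (1, 'editor'), (1, 'publisher'), (2, 'writer');
""")

pattern = "%Jenkz%"
result = conn.execute(
    """SELECT r.user_id, r.role,
              CASE r.role
                  WHEN 'writer' THEN w.publication_id
                  WHEN 'editor' THEN e.publication_id
                  WHEN 'publisher' THEN pg.publication_id
              END AS publication_id
       FROM (
           SELECT user_id FROM users WHERE firstname LIKE ?
           UNION
           SELECT user_id FROM users WHERE lastname LIKE ?
       ) AS dt
       JOIN role r ON r.user_id = dt.user_id
       -- Each LEFT JOIN only matches for the role it belongs to.
       LEFT JOIN link_writers w ON r.user_id = w.user_id AND r.role = 'writer'
       LEFT JOIN link_editors e ON r.user_id = e.user_id AND r.role = 'editor'
       LEFT JOIN link_publishers p ON r.user_id = p.user_id AND r.role = 'publisher'
       LEFT JOIN link_publisher_groups pg ON p.publisher_group_id = pg.publisher_group_id
       ORDER BY 2, 3""",
    (pattern, pattern),
).fetchall()
```

Because each LEFT JOIN is tied to a single role value, every role row fans out only into its own link table, and the CASE picks the publication_id from whichever table matched.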