MySQL: searching users and their roles

Time: 2010-03-17 11:31:10

Tags: mysql search join

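I am rewriting the SQL that lets a user search for any other user on our site and shows their roles.

As an example, the roles could be "Writer", "Editor" and "Publisher".

Each role links a user to a publication.

A user can hold multiple roles across multiple publications.

Example table setup: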

"users" : user_id, firstname, lastname
"publications" : publication_id, name  
"link_writers" : user_id, publication_id  
"link_editors" : user_id, publication_id  
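
For concreteness, the layout above might be declared roughly as follows. The column types, the InnoDB engine and the composite primary keys are assumptions made for illustration; the question does not state them.

```sql
-- Hypothetical DDL for the tables listed above; types and keys are assumed.
CREATE TABLE users (
  user_id   INT NOT NULL AUTO_INCREMENT,
  firstname VARCHAR(50) NOT NULL,
  lastname  VARCHAR(50) NOT NULL,
  PRIMARY KEY (user_id)
) ENGINE=InnoDB;

CREATE TABLE publications (
  publication_id INT NOT NULL AUTO_INCREMENT,
  name           VARCHAR(100) NOT NULL,
  PRIMARY KEY (publication_id)
) ENGINE=InnoDB;

-- user_id leads the key so that a lookup by user_id can use it.
CREATE TABLE link_writers (
  user_id        INT NOT NULL,
  publication_id INT NOT NULL,
  PRIMARY KEY (user_id, publication_id)
) ENGINE=InnoDB;

CREATE TABLE link_editors (
  user_id        INT NOT NULL,
  publication_id INT NOT NULL,
  PRIMARY KEY (user_id, publication_id)
) ENGINE=InnoDB;
```

With user_id as the leading key column, each link table can be probed by user_id without a full scan.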

Current pseudo SQL:

SELECT * FROM (
  (SELECT user_id FROM users WHERE firstname LIKE '%Jenkz%') 
  UNION 
  (SELECT user_id FROM users WHERE lastname LIKE '%Jenkz%')
) AS dt
JOIN (ROLES STATEMENT) AS roles ON roles.user_id = dt.user_id

Currently my roles statement is:

SELECT dt2.user_id, dt2.publication_id, dt2.role FROM (
  (SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
  FROM link_writers)
  UNION
  (SELECT 'editor' AS role, link_editors.user_id, link_editors.publication_id
  FROM link_editors)
) AS dt2

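The reason for wrapping the roles statement in a UNION is that some roles are more complex and need a table join to find the publication_id and user_id.

For example, "publishers" might be linked across two tables: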

"link_publishers": user_id, publisher_group_id
"link_publisher_groups": publisher_group_id, publication_id

So in that case, the query forming part of my UNION would be:

SELECT 'publisher' AS role, link_publishers.user_id, link_publisher_groups.publication_id
FROM link_publishers
JOIN link_publisher_groups ON link_publisher_groups.publisher_group_id = link_publishers.publisher_group_id

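I'm fairly confident that my table setup is sound (I was warned off a one-table-for-all design while researching the layout). My problem is that the users table now has 100,000 rows and each link table has 70,000 rows.

The initial lookup in the users table is fast, but the joins really slow things down.

How can I join only the relevant roles?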

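-------------------------- EDIT --------------------------

EXPLAIN output: http://img155.imageshack.us/img155/4758/stackusersearchjoins.gif

The EXPLAIN output is linked above (open it in a new window to see it at full resolution).

The bottom part, in red, is the "WHERE firstname LIKE '%Jenkz%'" search; the third row searches WHERE CONCAT(firstname, ' ', lastname) LIKE '%Jenkz%'. Hence the large row count, but I think this is unavoidable, unless there is a way to put an index on concatenated fields?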
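
An ordinary B-tree index cannot serve a LIKE pattern with a leading wildcard, whether on a single column or on a concatenation. If matching whole words or word prefixes (rather than arbitrary substrings) is acceptable, one possible alternative is a FULLTEXT index over both name columns. The index name ft_name is made up for this sketch, and FULLTEXT indexes require MyISAM, or InnoDB on MySQL 5.6 and later.

```sql
-- Assumed index name; requires MyISAM, or InnoDB on MySQL 5.6+.
ALTER TABLE users ADD FULLTEXT INDEX ft_name (firstname, lastname);

-- Matches rows where either name contains a word starting with "Jenkz".
-- Unlike LIKE '%Jenkz%', it does not match "Jenkz" in the middle of a word.
SELECT user_id
FROM users
WHERE MATCH (firstname, lastname) AGAINST ('Jenkz*' IN BOOLEAN MODE);
```

The column list in MATCH has to be the same as the column list of the FULLTEXT index, which is why both columns appear in each.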

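The green part at the top shows the total number of rows scanned by the ROLES STATEMENT.

You can then see each of the UNION clauses (#6 - #12), every one of which shows a large number of rows. Some of the indexes are normal, some are unique.

It seems that MySQL does not optimize by using dt.user_id as a comparison inside the UNION statements. Is there a way to force this behaviour?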
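
One way to get that effect without relying on the optimizer is to apply the name filter inside each UNION branch, so that every branch starts from the matching users and probes its link table by user_id. A sketch, assuming each link table has an index whose leading column is user_id (the aliases u, lw and le are introduced here for brevity):

```sql
SELECT 'writer' AS role, lw.user_id, lw.publication_id
FROM users u
JOIN link_writers lw ON lw.user_id = u.user_id
WHERE u.firstname LIKE '%Jenkz%' OR u.lastname LIKE '%Jenkz%'

UNION

SELECT 'editor' AS role, le.user_id, le.publication_id
FROM users u
JOIN link_editors le ON le.user_id = u.user_id
WHERE u.firstname LIKE '%Jenkz%' OR u.lastname LIKE '%Jenkz%';
```

The trade-off is that the scan of users is repeated once per branch, so this form pays off only if that scan is cheap compared with reading every row of every link table.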

Please note that my real setup is not publications and writers but "webmasters", "players", "teams" and so on.

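3 Answers:

Answer 0: (score: 0)

My initial thought was to create a temporary table to hold (and index) the user_ids that match the name, and to use it to join against each link table. Unfortunately, in MySQL a temporary table can only be referenced ONCE in a query.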
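
A sketch of what that restriction looks like in practice. The table name tmp_users is hypothetical, and the error is the one expected on MySQL versions that enforce the single-reference rule for temporary tables:

```sql
-- Hypothetical temporary table holding the matching users.
CREATE TEMPORARY TABLE tmp_users (
  user_id INT NOT NULL,
  PRIMARY KEY (user_id)
);

INSERT INTO tmp_users
SELECT user_id FROM users
WHERE firstname LIKE '%Jenkz%' OR lastname LIKE '%Jenkz%';

-- Referencing tmp_users in two UNION branches of one statement is
-- expected to fail with ERROR 1137 (HY000): Can't reopen table
SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
FROM link_writers
JOIN tmp_users ON tmp_users.user_id = link_writers.user_id
UNION
SELECT 'editor' AS role, link_editors.user_id, link_editors.publication_id
FROM link_editors
JOIN tmp_users ON tmp_users.user_id = link_editors.user_id;
```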

The nasty workaround is to create a permanent table and add the connection_id to its primary key, so that separate sessions do not get mixed up.

create table tt ( connection_id int not null,
                  user_id int not null, 
                  firstname varchar(10) not null, 
                  lastname varchar(10) not null,
                  primary key( connection_id, user_id ) );

Each time an answer is needed, the following sequence is repeated:

delete from tt where connection_id = connection_id();

insert into tt 
  SELECT connection_id(), user_id, firstname, lastname FROM users 
  WHERE firstname LIKE '%Jenkz%' 
  UNION 
  SELECT connection_id(), user_id, firstname, lastname FROM users 
  WHERE lastname LIKE '%Jenkz%';

Next, extend the existing UNION so that only the relevant user_ids are pulled out:

SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
FROM link_writers
JOIN tt ON tt.connection_id = connection_id() and tt.user_id = link_writers.user_id

UNION

SELECT 'editor' AS role, link_editors.user_id, link_editors.publication_id
FROM link_editors
JOIN tt ON tt.connection_id = connection_id() and tt.user_id = link_editors.user_id

UNION

SELECT 'publisher' AS role, link_publishers.user_id, link_publisher_groups.publication_id
FROM link_publishers
JOIN link_publisher_groups 
   ON link_publisher_groups.publisher_group_id = link_publishers.publisher_group_id
JOIN tt ON tt.connection_id = connection_id() and tt.user_id = link_publishers.user_id

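This may be an improvement, since not every row of every link table gets pulled into the union.

The EXPLAIN is a bit odd in that only 4 bytes of the index on tt are used - I would have expected all 8. Perhaps this is because I have so little data in tt.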

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: tt
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: link_writers
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.tt.user_id
         rows: 1
        Extra: Using index
*************************** 3. row ***************************
           id: 2
  select_type: UNION
        table: tt
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra: Using index
*************************** 4. row ***************************
           id: 2
  select_type: UNION
        table: link_editors
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.tt.user_id
         rows: 1
        Extra: Using index
*************************** 5. row ***************************
           id: 3
  select_type: UNION
        table: tt
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra: Using index
*************************** 6. row ***************************
           id: 3
  select_type: UNION
        table: link_publishers
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.tt.user_id
         rows: 1
        Extra: Using index
*************************** 7. row ***************************
           id: 3
  select_type: UNION
        table: link_publisher_groups
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.link_publishers.publisher_group_id
         rows: 2
        Extra: Using index
*************************** 8. row ***************************
           id: NULL
  select_type: UNION RESULT
        table: <union1,2,3>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
        Extra:
8 rows in set (0.00 sec)
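
A possible reading of the key_len of 4 in this plan: tt is the first table read in each branch and is filtered only by the constant connection_id (ref: const), so only that 4-byte INT prefix of the primary key is needed; user_id is then read from tt and used to probe the link table. To check this, one could force the opposite join order with STRAIGHT_JOIN, in which case tt would be probed on both key columns and EXPLAIN would presumably report a key_len of 8. This is a diagnostic sketch, not a recommended query form.

```sql
-- STRAIGHT_JOIN makes MySQL read link_writers first, then look up tt
-- by (connection_id, user_id), i.e. the full primary key.
EXPLAIN
SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
FROM link_writers
STRAIGHT_JOIN tt
   ON tt.connection_id = connection_id()
  AND tt.user_id = link_writers.user_id;
```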

Answer 1: (score: 0)

After checking OMG Ponies' answer to SO - Use Of Correlated Subquery, I came up with this:

SELECT * FROM (
  (SELECT user_id FROM users WHERE firstname LIKE '%Jenkz%') 
  UNION 
  (SELECT user_id FROM users WHERE lastname LIKE '%Jenkz%')
) AS dt
JOIN ( SELECT 'writer' AS role, link_writers.user_id, link_writers.publication_id
       FROM link_writers
       UNION
       SELECT 'editor' AS role, link_editors.user_id, link_editors.publication_id
       FROM link_editors
       UNION
       SELECT 'publisher' AS role, lp.user_id, lpg.publication_id
       FROM link_publishers lp
       JOIN link_publisher_groups lpg ON lpg.publisher_group_id = lp.publisher_group_id
     ) roles on roles.user_id = dt.user_id

The EXPLAIN for this looks reasonable on my tiny dataset. What does it look like on the real thing?

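Answer 2: (score: 0)

Another approach is to bend the design's normalization slightly in order to better support your query.

To do this, create a new table "role":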

create table role (
     user_id int not null,
     role enum ('writer', 'editor', 'publisher' ) not null,
     primary key (user_id, role )
);

This table is updated whenever a new row containing a user_id is added to one of the link tables:

insert ignore into role values( $user_id, $role );

After a while, the role entry will most likely already exist, hence the "ignore" modifier.
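
If the application cannot be relied on to issue that insert every time, one option is to let the database do it with AFTER INSERT triggers. The trigger names below are made up for this sketch, and it covers inserts only:

```sql
-- One trigger per link table; trigger names are hypothetical.
CREATE TRIGGER link_writers_ai AFTER INSERT ON link_writers
FOR EACH ROW
  INSERT IGNORE INTO role (user_id, role) VALUES (NEW.user_id, 'writer');

CREATE TRIGGER link_editors_ai AFTER INSERT ON link_editors
FOR EACH ROW
  INSERT IGNORE INTO role (user_id, role) VALUES (NEW.user_id, 'editor');

CREATE TRIGGER link_publishers_ai AFTER INSERT ON link_publishers
FOR EACH ROW
  INSERT IGNORE INTO role (user_id, role) VALUES (NEW.user_id, 'publisher');
```

Deleting a user's last link row does not remove the matching role row here. A stale role row would show up in the search as a role with a NULL publication_id, so deletions need separate handling if that matters.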

The table can be bootstrapped from the existing tables:

insert ignore into role select distinct user_id, 'writer' from link_writers;
insert ignore into role select distinct user_id, 'editor' from link_editors;
insert ignore into role select distinct user_id, 'publisher' from link_publishers;

Your search query then becomes a set of simple JOINs, which MySQL should have no trouble optimizing:

SELECT 
   r.user_id, 
   r.role,
   case r.role 
        when 'writer' then w.publication_id
        when 'editor' then e.publication_id
        when 'publisher' then pg.publication_id
        end as publication_id
FROM (
  (SELECT user_id FROM users WHERE firstname LIKE '%Jenkz%') 
  UNION 
  (SELECT user_id FROM users WHERE lastname LIKE '%Jenkz%')
) AS dt
JOIN role r on r.user_id = dt.user_id
LEFT JOIN link_writers w on r.user_id = w.user_id and r.role = 'writer'
LEFT JOIN link_editors e on r.user_id = e.user_id and r.role = 'editor'
LEFT JOIN link_publishers p on r.user_id = p.user_id and r.role = 'publisher'
LEFT JOIN link_publisher_groups pg on p.publisher_group_id = pg.publisher_group_id;

This will give a rather "wide" answer.