Searching for a set of values within another set of values across rows

Time: 2013-01-03 15:02:59

Tags: mysql

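Hi, I'm having trouble with the execution time of a query that searches for users (from the users table) who have at least one interest from a specified set of interests and a location from a specified set of locations. I have this test DB: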

    CREATE TABLE IF NOT EXISTS `interests` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(255) NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;

    --
    -- Dumping data for table `interests`
    --

    INSERT INTO `interests` (`id`, `name`) VALUES
    (1, 'auto'),
    (2, 'moto'),
    (3, 'health'),
    (4, 'garden'),
    (5, 'house'),
    (6, 'music'),
    (7, 'video'),
    (8, 'games'),
    (9, 'it');

    -- --------------------------------------------------------

    --
    -- Table structure for table `locations`
    --

    CREATE TABLE IF NOT EXISTS `locations` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(50) NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;

    --
    -- Dumping data for table `locations`
    --

    INSERT INTO `locations` (`id`, `name`) VALUES
    (1, 'engalnd'),
    (2, 'austia'),
    (3, 'germany'),
    (4, 'france'),
    (5, 'belgium'),
    (6, 'italy'),
    (7, 'russia'),
    (8, 'poland'),
    (9, 'norway'),
    (10, 'romania');

    -- --------------------------------------------------------

    --
    -- Table structure for table `users`
    --

    CREATE TABLE IF NOT EXISTS `users` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `email` varchar(255) NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;

    --
    -- Dumping data for table `users`
    --

    INSERT INTO `users` (`id`, `email`) VALUES
    (1, 'email1@test.com'),
    (2, 'email2@test.com'),
    (3, 'email3@test.com'),
    (4, 'email4@test.com'),
    (5, 'email5@test.com'),
    (6, 'email6@test.com'),
    (7, 'email7@test.com'),
    (8, 'email8@test.com'),
    (9, 'email9@test.com'),
    (10, 'email10@test.com');

    -- --------------------------------------------------------

    --
    -- Table structure for table `users_interests`
    --

    CREATE TABLE IF NOT EXISTS `users_interests` (
      `user_id` int(11) NOT NULL,
      `interest_id` int(11) NOT NULL,
      PRIMARY KEY (`user_id`,`interest_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8;

    --
    -- Dumping data for table `users_interests`
    --

    INSERT INTO `users_interests` (`user_id`, `interest_id`) VALUES
    (1, 1),
    (1, 2),
    (2, 5),
    (2, 7),
    (2, 8),
    (3, 1),
    (4, 1),
    (4, 5),
    (4, 6),
    (4, 7),
    (4, 8),
    (5, 1),
    (5, 2),
    (5, 8),
    (6, 3),
    (6, 7),
    (6, 8),
    (7, 7),
    (7, 9),
    (8, 5);

    -- --------------------------------------------------------

    --
    -- Table structure for table `users_locations`
    --

    CREATE TABLE IF NOT EXISTS `users_locations` (
      `user_id` int(11) NOT NULL,
      `location_id` int(11) NOT NULL,
      PRIMARY KEY (`user_id`,`location_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8;

    --
    -- Dumping data for table `users_locations`
    --

    INSERT INTO `users_locations` (`user_id`, `location_id`) VALUES
    (2, 5),
    (2, 7),
    (2, 8),
    (3, 1),
    (4, 1),
    (4, 5),
    (4, 6),
    (4, 7),
    (4, 8),
    (5, 1),
    (5, 2),
    (5, 8),
    (6, 3),
    (6, 7),
    (6, 8),
    (7, 7),
    (7, 9),
    (8, 5);

Is there a better way to query it than this:

    SELECT email,
           GROUP_CONCAT( DISTINCT ui.interest_id ) AS interests,
           GROUP_CONCAT( DISTINCT ul.location_id ) AS locations
    FROM `users` u
    LEFT JOIN users_interests ui ON u.id = ui.user_id
    LEFT JOIN users_locations ul ON u.id = ul.user_id
    GROUP BY u.id
    HAVING IF( interests IS NOT NULL,
               FIND_IN_SET( 2, interests ) OR FIND_IN_SET( 3, interests ), 1 )
       AND IF( locations IS NOT NULL,
               FIND_IN_SET( 2, locations ) OR FIND_IN_SET( 3, locations ), 1 )

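This is the best solution I have found, but it is still slow with 500k and 1 million rows in the relation tables (locations and interests), especially when matching against a large number of values (say, more than 50 locations and interests).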

So I'm trying to get the same result this query produces, only faster:

    email               interests       locations
    email1@test.com     1,2             [BLOB - 0B]
    email5@test.com     1,2,8           1,2,8
    email6@test.com     3,7,8           3,7,8
    email9@test.com     [BLOB - 0B]     [BLOB - 0B]
    email10@test.com    [BLOB - 0B]     [BLOB - 0B]

I also tried joining against a SELECT ... UNION derived table holding the set to match, but it was even slower. Like this:

    SELECT *
    FROM `users` u
    LEFT JOIN users_interests ui ON u.id = ui.user_id
    LEFT JOIN users_locations ul ON u.id = ul.user_id

    LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is` ON ui.interest_id = is.interest
    LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls` ON ul.location_id = ls.location

    WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1) AND
          IF(ul.user_id IS NOT NULL, ls.location IS NOT NULL,1)

    GROUP BY u.id

I'm using this for a basic targeting system. I'd really appreciate any suggestions. Thanks!

1 Answer:

Answer 0 (score: 1)

You are using `is` as an alias, and IS is a reserved word in MySQL.

Also, your GROUP BY may be slowing the query down, and I don't see the point of GROUP BY u.id here, since u.id is already a unique id.
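Whether that GROUP BY is really redundant depends on how many rows the two LEFT JOINs produce per user, since each user's interests are multiplied by that user's locations. A rough way to check this against the sample schema from the question (the column alias joined_rows is just a name chosen for this sketch):

```sql
-- Count the rows the two LEFT JOINs produce for each user.
-- joined_rows is a hypothetical alias used only in this sketch.
SELECT u.id,
       COUNT(*) AS joined_rows
FROM users u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
GROUP BY u.id
ORDER BY joined_rows DESC;
```

With the sample data in the question, user 4 has five interests and five locations, so the joins yield 25 rows for that one user.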

See the demo.

Try putting backticks around it:

    SELECT *
    FROM `users` u
    LEFT JOIN users_interests ui ON u.id = ui.user_id
    LEFT JOIN users_locations ul ON u.id = ul.user_id

    LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is`
        ON ui.interest_id = `is`.interest
    LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls`
        ON ul.location_id = `ls`.location

    WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1)
      AND IF(ul.user_id IS NOT NULL, `ls`.location IS NOT NULL,1)
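
As a sketch of another way to phrase the filter, assuming the intended rule is "a user matches if they have no interests at all or at least one interest in the set, and likewise for locations" (which is how the HAVING clause in the question reads), the set test can be moved into EXISTS subqueries so that users are filtered before anything is concatenated. This has not been benchmarked against the original query; whether it is actually faster on the real tables is something to confirm with EXPLAIN.

```sql
-- Sketch only: filter users with EXISTS / NOT EXISTS instead of
-- GROUP_CONCAT + FIND_IN_SET in HAVING. The sets (2, 3) are the
-- example values from the question.
SELECT u.email,
       -- Concatenate each user's ids separately, so the two relation
       -- tables are never multiplied against each other.
       (SELECT GROUP_CONCAT(ui.interest_id)
          FROM users_interests ui
         WHERE ui.user_id = u.id) AS interests,
       (SELECT GROUP_CONCAT(ul.location_id)
          FROM users_locations ul
         WHERE ul.user_id = u.id) AS locations
FROM users u
WHERE (
        -- user has no interests at all ...
        NOT EXISTS (SELECT 1 FROM users_interests ui
                     WHERE ui.user_id = u.id)
        -- ... or has at least one interest in the wanted set
        OR EXISTS (SELECT 1 FROM users_interests ui
                    WHERE ui.user_id = u.id
                      AND ui.interest_id IN (2, 3))
      )
  AND (
        -- same rule for locations
        NOT EXISTS (SELECT 1 FROM users_locations ul
                     WHERE ul.user_id = u.id)
        OR EXISTS (SELECT 1 FROM users_locations ul
                    WHERE ul.user_id = u.id
                      AND ul.location_id IN (2, 3))
      );
```

On the sample data this should select the same five users as the expected result in the question. Users with no rows in a relation table get NULL in the corresponding column. Each subquery looks up rows by user_id first, which the existing primary keys on (user_id, interest_id) and (user_id, location_id) already cover.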