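MySQL query - is it possible to include this clause?

Asked: 2011-01-15 13:55:04

Tags: sql mysql

I have the following query, which retrieves 4 adverts from certain categories in random order.

At the moment, if a user has more than 1 advert, all of those adverts may be retrieved - I need to limit it so that only 1 advert per user is displayed.

Can this be achieved in the same query?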

SELECT      a.advert_id, a.title, a.url, a.user_id, 
            FLOOR(1 + RAND() * x.m_id) 'rand_ind' 

FROM        adverts AS a
INNER JOIN  advert_categories AS ac
ON          a.advert_id = ac.advert_id,
(
            SELECT MAX(t.advert_id) - 1 'm_id' 
            FROM adverts t
)           x

WHERE       ac.category_id IN 
(
            SELECT category_id
            FROM website_categories
            WHERE website_id = '8'
)
AND         a.advert_type = 'text'

GROUP BY    a.advert_id
ORDER BY    rand_ind 
LIMIT       4

6 answers:

Answer 0 (score: 3)

Note: the solution is the last query at the bottom of this answer.

Test schema and data

create table adverts (
    advert_id int primary key, title varchar(20), url varchar(20), user_id int, advert_type varchar(10))
;
create table advert_categories (
    advert_id int, category_id int, primary key(category_id, advert_id))
;
create table website_categories (
    website_id int, category_id int, primary key(website_id, category_id))
;
insert website_categories values
    (8,1),(8,3),(8,5),
    (1,1),(2,3),(4,5)
;
insert adverts (advert_id, title, user_id) values
    (1, 'StackExchange', 1),
    (2, 'StackOverflow', 1),
    (3, 'SuperUser', 1),
    (4, 'ServerFault', 1),
    (5, 'Programming', 1),
    (6, 'C#', 2),
    (7, 'Java', 2),
    (8, 'Python', 2),
    (9, 'Perl', 2),
   (10, 'Google', 3)
;
update adverts set advert_type = 'text'
;
insert advert_categories values
    (1,1),(1,3),
    (2,3),(2,4),
    (3,1),(3,2),(3,3),(3,4),
    (4,1),
    (5,4),
    (6,1),(6,4),
    (7,2),
    (8,1),
    (9,3),
   (10,3),(10,5)
;

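Data properties

  • Each website can belong to multiple categories
  • For simplicity, all adverts are of type 'text'
  • Each advert can belong to multiple categories. If a website has several categories that are matched in advert_categories for the same advert, that advert_id shows up more than once when the 3 tables are joined directly, as in the next query.

This query joins the 3 tables together (note that ids 1, 3 and 10 each appear twice)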

select *
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
inner join adverts a on a.advert_id = ac.advert_id and  a.advert_type = 'text'
where wc.website_id='8'
order by a.advert_id
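
As a quick check of the duplication, a grouped version of the same join can count how often each advert appears. This is only a sketch against the test schema above; the alias times_matched is made up for illustration.

```sql
# Count how often each advert appears in the straight 3-table join.
# times_matched is an illustrative alias, not part of the original query.
select a.advert_id, count(*) as times_matched
from website_categories wc
inner join advert_categories ac on wc.category_id = ac.category_id
inner join adverts a on a.advert_id = ac.advert_id and a.advert_type = 'text'
where wc.website_id = 8
group by a.advert_id
having count(*) > 1
order by a.advert_id;
```

With the test data above, this should list adverts 1, 3 and 10, each matched twice.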

To make each advert show only once, this is the core query that lists all eligible adverts, each exactly once

        select *
        from adverts a
        where a.advert_type = 'text'
          and exists (
            select *
            from website_categories wc
            inner join advert_categories ac on wc.category_id = ac.category_id
            where wc.website_id='8'
              and a.advert_id = ac.advert_id)

The next query retrieves all the advert_ids to be displayed

select advert_id, user_id
from (
    select
        advert_id, user_id,
        @r := @r + 1 r
    from (select @r:=0) r
    cross join 
    (
        # core query -- vvv
        select a.advert_id, a.user_id
        from adverts a
        where a.advert_type = 'text'
          and exists (
            select *
            from website_categories wc
            inner join advert_categories ac on wc.category_id = ac.category_id
            where wc.website_id='8'
              and a.advert_id = ac.advert_id)
        # core query -- ^^^
        order by rand()
    ) EligibleAdsAndUserIDs
) RowNumbered
group by user_id
order by r
limit 2

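This query has 3 levels

  1. Aliased EligibleAdsAndUserIDs: the core query, sorted randomly using order by rand()
  2. Aliased RowNumbered: a row number added to the core query, using MySQL's side-effecting @variables
  3. The outermost query forces MySQL to collect the rows as numbered randomly by the inner queries, and group by user_id makes it keep only the first row for each user_id. limit 2 causes the query to stop as soon as two distinct user_ids have been encountered.

This is the final query, which takes the advert_ids from the previous query and joins them back to the table adverts to retrieve the required columns. The result:

    1. shows each user_id only once
    2. features users with more adverts proportionally (statistically) to the number of eligible adverts they have
    3. Note: point (2) works because the more adverts you have, the more likely you are to hit the top placings in the row-numbering subquery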

      select a.advert_id, a.title, a.url, a.user_id
      from
      (
          select advert_id
          from (
              select
                  advert_id, user_id,
                  @r := @r + 1 r
              from (select @r:=0) r
              cross join 
              (
                  # core query -- vvv
                  select a.advert_id, a.user_id
                  from adverts a
                  where a.advert_type = 'text'
                    and exists (
                      select *
                      from website_categories wc
                      inner join advert_categories ac on wc.category_id = ac.category_id
                      where wc.website_id='8'
                        and a.advert_id = ac.advert_id)
                  # core query -- ^^^
                  order by rand()
              ) EligibleAdsAndUserIDs
          ) RowNumbered
          group by user_id
          order by r
          limit 2
      ) Top2
      inner join adverts a on a.advert_id = Top2.advert_id;
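
On MySQL 8.0 or later (an assumption; this thread predates it), the same "one random advert per user" idea can be sketched with the ROW_NUMBER() window function instead of user variables. This is a different technique from the query above, not a rewrite of it; the aliases rn and OnePerUser are illustrative.

```sql
# Sketch for MySQL 8.0+: one random eligible advert per user, then 2 random users.
select advert_id, title, url, user_id
from (
    select a.advert_id, a.title, a.url, a.user_id,
           # number each user's eligible adverts in random order
           row_number() over (partition by a.user_id order by rand()) as rn
    from adverts a
    where a.advert_type = 'text'
      and exists (
        select *
        from website_categories wc
        inner join advert_categories ac on wc.category_id = ac.category_id
        where wc.website_id = 8
          and a.advert_id = ac.advert_id)
) OnePerUser
where rn = 1
order by rand()
limit 2;
```

Unlike the user-variable version, this sketch gives every eligible user the same chance of being shown, regardless of how many adverts they have.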
      

Answer 1 (score: 1)

I'm thinking of something, but I don't have MySQL available... could you try this query and see whether it works or crashes...

SELECT 
      PreQuery.user_id, 
      (select max( tmp.someRandom ) from PreQuery tmp where tmp.User_ID = PreQuery.User_ID ) MaxRandom
   from 
      ( select adverts.user_id,
               rand() someRandom
           from adverts, advert_categories
           where adverts.advert_id = advert_categories.advert_id ) PreQuery

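If the "tmp" alias is recognized as a temporary buffer of the preliminary query defined by the outer FROM clause, I may have something to work with... I think using a field that is itself a select statement from the queried FROM won't work, but if it does, I know I'll have something solid for you.

Answer 2 (score: 1)

OK, this one might hurt the head a bit, but let's make sense of things... The innermost "Core Query" is the basis for getting all unique, randomly ordered qualified users who have a qualifying advert, based on the chosen categories and type = 'text'. Since the order is random, I don't care what the assigned sequence is, and I just order by it. The limit 4 returns the first 4 entries that qualify. This holds regardless of one user having 1 advert and another having 1000.

Next, join to the adverts, reversing the table / join qualifications... but because of the WHERE - IN sub-select, the sub-select runs for each unique user ID qualified by the "CoreQuery", and it is only done 4 times because of the CoreQuery's inner limit. So even if 100 users have different adverts, we get 4 users.

Now, the join to the CoreQuery is the adverts table, based on the same qualifying user. Normally this would join all records against the core query, since they belong to the same user... which is correct... however, the next WHERE clause is what filters it down to only one advert for the given person.

The sub-select makes sure its "Advert_ID" matches the one picked inside the sub-select. The sub-select is based only on the current "CoreQuery.user_ID" and gets all the qualifying categories / adverts for that user (wrong... we don't want all adverts)... so adding an ORDER BY RAND() randomizes only this one person's adverts in the result set... and limiting that to 1 gives only one of their qualified adverts...

So, the CoreQuery is restricted to 4 users. Then, for each qualified user ID, we get only one of the qualified adverts (through the inner ORDER BY RAND() and LIMIT 1)...

Although I don't have MySQL to try it on, I believe the query is completely legitimate, and I hope it works for you.... man, I love brain teasers like this...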

SELECT
       ad1.*
   from 
       ( SELECT ad.user_id,
                count(*) as UserAdCount,
                RAND() as ANYRand
             from 
                website_categories wc
                   inner join advert_categories ac
                      ON wc.category_id = ac.category_id
                      inner join adverts ad
                         ON    ac.advert_id = ad.advert_id 
                           AND ad.advert_type = 'text'
             where
                wc.website_id = 8
             GROUP BY 
                1
             order by
                3 
             limit 
                4 ) CoreQuery,
         adverts ad1
     WHERE
             ad1.advert_type = 'text'
         AND CoreQuery.User_ID = ad1.User_ID
         AND ad1.advert_id in 
                ( select 
                          ad2.advert_id
                     FROM 
                          adverts ad2,
                          advert_categories ac2,
                          website_categories wc2
                     WHERE
                            ad2.user_id = CoreQuery.user_id
                        AND ad2.advert_id = ac2.advert_id
                        AND ac2.category_id = wc2.category_id
                        AND wc2.website_id = 8
                     ORDER BY 
                        RAND()
                     LIMIT 
                        1 )

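Answer 3 (score: 1)

I'd suggest doing the randomization in PHP. That is faster than doing it in MySQL.

"However, when the table is large (over about 10,000 rows) this method of selecting a random row becomes increasingly slow with the size of the table and can create a great load on the server. I tested this on a table I was working on that contained 2,394,968 rows. It took 717 seconds (12 minutes!) to return a random row." http://www.greggdev.com/web/articles.php?id=6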

Answer 4 (score: 0)

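set @userid = -1;
select advert_id, title, user_id
from (
  select
    s.advert_id,
    s.title,
    # 1 for the first row seen for each user, 0 for the rest
    case when @userid = s.user_id then 0 else 1 end as isfirst,
    (@userid := s.user_id) as user_id
  from (
    # sort first, so each user's adverts are adjacent and in random order;
    # table and column names follow the schema in the question
    select a.advert_id, a.title, a.user_id
    from adverts a
      inner join advert_categories ac on ac.advert_id = a.advert_id
      inner join website_categories wc on wc.category_id = ac.category_id
    where wc.website_id = 8
      and a.advert_type = 'text'
    order by a.user_id, rand()
  ) s
) flagged
# relies on MySQL evaluating @userid row by row in sorted order,
# which works in practice but is not guaranteed
where isfirst = 1
limit 4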

Answer 5 (score: 0)

Add COUNT(a.user_id) to the main SELECT (aliased, for example, as own) and add HAVING own < 2 after the GROUP BY

http://dev.mysql.com/doc/refman/5.5/en/select.html

I think this is the way to do it; if a user has more than one advert, we won't select that user.
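
A minimal sketch of what this suggestion would do, run against the test schema from answer 0. Both the alias own and the use of that schema are assumptions, since this answer gives no query of its own.

```sql
# Keep only users who own fewer than 2 eligible adverts.
select a.user_id, count(*) as own
from adverts a
where a.advert_type = 'text'
  and exists (
    select *
    from website_categories wc
    inner join advert_categories ac on wc.category_id = ac.category_id
    where wc.website_id = 8
      and a.advert_id = ac.advert_id)
group by a.user_id
having own < 2;
```

With that test data only user 3 (one eligible advert) should remain, so users with several adverts are dropped entirely rather than limited to one advert each.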