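MySQL: count how many times a field's values appear in a field of another table

Date: 2015-07-09 22:57:41

Tags: mysql sql join subquery

Given two tables, how can I find how many of the distinct X values appear among the distinct Y values, counting only Y rows dated within the 31 days (or one month) before date_X?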

tb1
    date_X        X
    2015-05-01    cat
    2015-05-01    pig
    2015-05-01    mouse
    2015-04-01    dog
    2015-04-01    horse
tb2  
    date_Y        Y
    2015-04-30    cat
    2015-04-02    cat
    2015-04-05    mouse
    2015-04-15    rabbit
    2015-04-10    pig
    2015-03-20    dog
    2015-03-10    horse
    2015-03-09    frog
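
To experiment with these rows, the two tables could be set up roughly as follows. This is only a sketch: the column types are an assumption, since only the values are shown above.

```sql
-- Assumed column types; the question shows values only.
create table tb1 (date_X date, X varchar(20));
create table tb2 (date_Y date, Y varchar(20));

-- Sample rows for tb1, copied from the listing above.
insert into tb1 (date_X, X) values
  ('2015-05-01', 'cat'),
  ('2015-05-01', 'pig'),
  ('2015-05-01', 'mouse'),
  ('2015-04-01', 'dog'),
  ('2015-04-01', 'horse');

-- Sample rows for tb2, copied from the listing above.
insert into tb2 (date_Y, Y) values
  ('2015-04-30', 'cat'),
  ('2015-04-02', 'cat'),
  ('2015-04-05', 'mouse'),
  ('2015-04-15', 'rabbit'),
  ('2015-04-10', 'pig'),
  ('2015-03-20', 'dog'),
  ('2015-03-10', 'horse'),
  ('2015-03-09', 'frog');
```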

For example, I would like:

date_period num_match count_y percent_match
2015-05-01   3            4        75
2015-04-01   2            3        67

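date_period is each distinct value of date_X.

num_match is the number of distinct Y values, dated at most 31 days before the given date_period, that match a distinct X value for that date_period.

count_y is the number of distinct Y values dated at most 31 days before the given date_period.

percent_match is simply num_match / count_y, expressed as a percentage.

This question is an extension of one I asked earlier: join mysql on a date range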

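1 Answer:

Answer 0: (score: 0)

One approach is to use a non-equijoin on the dates. You can then count the distinct values of y either over the whole joined set or only among the matches: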

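-- seqnum gives every tb2 row its own id; row_number() requires MySQL 8.0+
select x.date_x,
       count(distinct case when x.x = y.y then y.seqnum end) as nummatch,
       count(distinct y.seqnum) as count_y,
       (count(distinct case when x.x = y.y then y.seqnum end) /
        count(distinct y.seqnum)
       ) as ratio
from tb1 x left join
     (select t.*, row_number() over (order by t.date_y, t.y) as seqnum
      from tb2 t
     ) y
     -- use interval arithmetic: a bare "date - 31" is not date math in MySQL
     on y.date_y between x.date_x - interval 31 day and x.date_x
group by x.date_x;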

EDIT:

The query above treats the two "cat" rows in y as different values. I misread the desired result, so I think the appropriate query is:

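-- count distinct Y values rather than rows, so duplicates such as "cat" count once
select x.date_x,
       count(distinct case when x.x = y.y then y.y end) as nummatch,
       count(distinct y.y) as count_y,
       (count(distinct case when x.x = y.y then y.y end) /
        count(distinct y.y)
       ) as ratio
from tb1 x left join
     tb2 y
     -- use interval arithmetic: a bare "date - 31" is not date math in MySQL
     on y.date_y between x.date_x - interval 31 day and x.date_x
group by x.date_x;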
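
If the output should use the column names from the question and a whole-number percentage, a variant along these lines might work. It is a sketch rather than a tested solution: it assumes the tables are named tb1 and tb2 as in the question, that the window includes date_X itself, and that rounding to the nearest integer is acceptable.

```sql
select x.date_X as date_period,
       -- distinct Y values in the window that also occur as X for this date
       count(distinct case when x.X = y.Y then y.Y end) as num_match,
       -- all distinct Y values in the window
       count(distinct y.Y) as count_y,
       -- nullif() leaves the percentage NULL when no Y rows fall in the window
       round(100 * count(distinct case when x.X = y.Y then y.Y end)
             / nullif(count(distinct y.Y), 0)) as percent_match
from tb1 x
left join tb2 y
       on y.date_Y between x.date_X - interval 31 day and x.date_X
group by x.date_X
order by x.date_X desc;
```

Working through the 31-day windows by hand for the sample rows gives 3 matches out of 4 distinct Y values (75) for 2015-05-01 and 2 out of 3 (67) for 2015-04-01.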