SQL query with a sum column based on other records being similar (within a range)

Time: 2015-01-23 16:47:16

Tags: sql oracle

For some reason I can't wrap my head around this. I have a table 'record' with the columns: id, name, genre, date, channel

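I want a query that returns every record in the table plus one extra column holding the count of all records that have the same genre, the same date (day only) and a channel within .0005 of the current record's. Hopefully that makes sense. I'll try to illustrate: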

Table: record

id        name        genre       date          Channel            
1         Scott       rock        2014-01-21    30.0345
2         Jim         rap         2014-01-21    55.0000
3         Dave        country     2014-01-22    23.0000
4         Tim         rock        2014-01-22    25.0000
5         Dave        rock        2014-01-21    34.0350
6         John        rock        2014-03-24    23.0000
7         Stan        rap         2013-09-16    14.0000
8         Jake        country     2014-01-21    30.0000
9         Mike        country     2014-01-22    22.9995
10        Jodi        country     2015-01-22    23.0006
11        Jodi        country     2015-01-22    23.0004

This is what I would like my query to return:

id        name        genre       date        Channel  same_day_count
1         Scott       rock        2014-01-21  30.0345  2 
2         Jim         rap         2014-01-21  55.0000  1 
3         Dave        country     2014-01-22  23.0000  3 
4         Tim         rock        2014-01-22  25.0000  1 
5         Dave        rock        2014-01-21  30.0350  2 
6         John        rock        2014-03-24  23.0000  1 
7         Stan        rap         2013-01-21  14.0000  1 
8         Jake        country     2014-01-21  30.0000  1 
9         Mike        country     2014-01-22  22.9995  3 
10        Jodi        country     2015-01-22  23.0006  1 
11        Jodi        country     2015-01-22  23.0004  3

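Please help; this is probably very simple for a SQL expert. Keep in mind that the whole timestamp does not need to match, i.e. hours:minutes:seconds. It only needs to be the same yyyy-mm-dd.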

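Actually, after reviewing this, I think the same_day_count column is ambiguous. However, the channels will vary and be unique enough that I believe it will give accurate results, provided this can actually be done in SQL.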

2 Answers:

Answer 0: (score: 1)

For the query as you describe it, you can use a correlated subquery (standard SQL, so it should work in any database):

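select r.*,
       -- count the rows (including this one) with the same genre and date
       -- and a channel within 0.0005 of this row's channel
       (select count(*)
        from records r2
        -- if date carries a time of day, compare trunc(r2.date) = trunc(r.date) in Oracle
        where r2.genre = r.genre and r2.date = r.date and
              abs(r2.channel - r.channel) <= 0.0005
       ) as same_day_count
from records r;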
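To make the counting rule concrete, here is a minimal Python sketch of the same pairwise comparison, not the answerer's code. The names `ROWS`, `TOLERANCE` and `same_day_counts` are made up for illustration, and the data is the question's sample with id 5 taken as 30.0350 (the value in the expected output). `Decimal` is used so that a difference of exactly 0.0005 compares cleanly, which binary floats do not guarantee.

```python
from decimal import Decimal

# Sample rows as (id, genre, date, channel); id 5 uses 30.0350 (assumption, see above).
ROWS = [
    (1, "rock", "2014-01-21", "30.0345"),
    (2, "rap", "2014-01-21", "55.0000"),
    (3, "country", "2014-01-22", "23.0000"),
    (4, "rock", "2014-01-22", "25.0000"),
    (5, "rock", "2014-01-21", "30.0350"),
    (6, "rock", "2014-03-24", "23.0000"),
    (7, "rap", "2013-09-16", "14.0000"),
    (8, "country", "2014-01-21", "30.0000"),
    (9, "country", "2014-01-22", "22.9995"),
    (10, "country", "2015-01-22", "23.0006"),
    (11, "country", "2015-01-22", "23.0004"),
]

TOLERANCE = Decimal("0.0005")


def same_day_counts(rows, tolerance=TOLERANCE):
    """Return {id: number of rows with the same genre and date whose
    channel is within `tolerance` of this row's channel (itself included)}."""
    counts = {}
    for rid, genre, day, channel in rows:
        ch = Decimal(channel)
        counts[rid] = sum(
            1
            for _, g2, d2, c2 in rows
            if g2 == genre and d2 == day and abs(Decimal(c2) - ch) <= tolerance
        )
    return counts


if __name__ == "__main__":
    for rid, count in sorted(same_day_counts(ROWS).items()):
        print(rid, count)
```

On this data ids 10 and 11 each get a count of 2, since they share a date with each other but not with ids 3 and 9.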

Edit:

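I notice that you are "chaining" the differences (row 11 has a value of 3 rather than 2). This makes it more challenging. Depending on how accurate you need to be, you could round the channel to the nearest 0.001 and test that value for equality. This works for your sample data, but it may not be what you are looking for.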
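A rough Python sketch of the rounding idea, under the assumption that halves round up (as Oracle's ROUND does for positive numbers). The helper names `bucket` and `bucket_counts` and the four rows in the usage example are hypothetical.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_UP


def bucket(channel, step="0.001"):
    """Round a channel (given as a string) to the nearest `step`, halves up."""
    return Decimal(channel).quantize(Decimal(step), rounding=ROUND_HALF_UP)


def bucket_counts(rows):
    """rows: iterable of (id, genre, day, channel).
    Returns {id: number of rows sharing genre, day and rounded channel}."""
    rows = list(rows)
    sizes = Counter((g, d, bucket(c)) for _, g, d, c in rows)
    return {rid: sizes[(g, d, bucket(c))] for rid, g, d, c in rows}


if __name__ == "__main__":
    # Hypothetical rows, all of the same genre and day.
    demo = [
        (9, "country", "2014-01-22", "22.9995"),
        (3, "country", "2014-01-22", "23.0000"),
        (11, "country", "2014-01-22", "23.0004"),
        (10, "country", "2014-01-22", "23.0006"),
    ]
    print(bucket_counts(demo))
```

The trade-off is visible in the demo rows: 23.0004 and 23.0006 are only 0.0002 apart, yet they round to 23.000 and 23.001 and so land in different buckets.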

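If 0.0000, 0.0004, 0.0008 and 0.0012 need to be combined into a single group, then you need to chain the values together. You can do this by using a subquery to flag the start of each sequence, taking a cumulative sum of that flag, and then counting within each group where the cumulative sum stays the same: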

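select r.*, count(*) over (partition by genre, date, grp) as same_day_count
from (select r.*,
             -- running total of the start flags: constant within one chain
             sum(isStart) over (partition by genre, date order by channel) as grp
      from (select r.*,
                   -- 1 marks the first row of a chain: no previous row, or a gap larger than 0.0005
                   (case when abs(channel - lag(channel) over (partition by genre, date order by channel) ) <=  0.0005
                         then 0 else 1
                    end) as isStart
            from records r
           ) r
     ) r;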
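The chaining logic can also be sketched outside SQL. The Python below is an illustration under assumptions, not a translation of the query: `chained_counts` is a hypothetical helper that sorts each (genre, day) partition by channel and starts a new chain whenever the gap to the previous channel exceeds the tolerance.

```python
from decimal import Decimal
from itertools import groupby


def chained_counts(rows, tolerance=Decimal("0.0005")):
    """rows: iterable of (id, genre, day, channel-as-string).
    Returns {id: size of the chain that the row belongs to}."""
    parsed = sorted((g, d, Decimal(c), rid) for rid, g, d, c in rows)
    counts = {}

    def flush(chain):
        # Every member of a finished chain gets the chain's size.
        for member in chain:
            counts[member] = len(chain)

    for _, partition in groupby(parsed, key=lambda r: (r[0], r[1])):
        chain, prev = [], None
        for _, _, ch, rid in partition:
            if prev is not None and ch - prev > tolerance:
                flush(chain)  # gap too large: close the current chain
                chain = []
            chain.append(rid)
            prev = ch
        flush(chain)
    return counts


if __name__ == "__main__":
    # Hypothetical rows: the first four chain together, the fifth stands alone.
    demo = [
        (1, "rock", "2014-01-21", "0.0000"),
        (2, "rock", "2014-01-21", "0.0004"),
        (3, "rock", "2014-01-21", "0.0008"),
        (4, "rock", "2014-01-21", "0.0012"),
        (5, "rock", "2014-01-21", "0.0020"),
    ]
    print(chained_counts(demo))
```

Note that with chaining, two rows can share a group even though they are more than 0.0005 apart, as long as intermediate rows bridge the gap.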

Answer 1: (score: 0)

How about using an analytic function, like so:

with record as (select 1 id, 'Scott' name, 'rock' genre, to_date('2014-01-21', 'yyyy-mm-dd') dt, '30.0345' Channel from dual union all
                select 2 id, 'Jim' name, 'rap' genre, to_date('2014-01-21', 'yyyy-mm-dd') dt, '55.0000' Channel from dual union all
                select 3 id, 'Dave' name, 'country' genre, to_date('2014-01-22', 'yyyy-mm-dd') dt, '23.0000' Channel from dual union all
                select 4 id, 'Tim' name, 'rock' genre, to_date('2014-01-22', 'yyyy-mm-dd') dt, '25.0000' Channel from dual union all
                select 5 id, 'Dave' name, 'rock' genre, to_date('2014-01-21', 'yyyy-mm-dd') dt, '30.0350' Channel from dual union all
                select 6 id, 'John' name, 'rock' genre, to_date('2014-03-24', 'yyyy-mm-dd') dt, '23.0000' Channel from dual union all
                select 7 id, 'Stan' name, 'rap' genre, to_date('2013-09-16', 'yyyy-mm-dd') dt, '14.0000' Channel from dual union all
                select 8 id, 'Jake' name, 'country' genre, to_date('2014-01-21', 'yyyy-mm-dd') dt, '30.0000' Channel from dual union all
                select 9 id, 'Mike' name, 'country' genre, to_date('2014-01-22', 'yyyy-mm-dd') dt, '22.9995' Channel from dual union all
                select 10 id, 'Jodi' name, 'country' genre, to_date('2015-01-22', 'yyyy-mm-dd') dt, '23.0006' Channel from dual union all
                select 11 id, 'Jodi' name, 'country' genre, to_date('2015-01-22', 'yyyy-mm-dd') dt, '23.0004' Channel from dual)
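select r.*,
       to_number(channel) - .0005 ch_minus,
       to_number(channel) + .0005 ch_plus,
       -- count the rows on the same day whose channel lies within +/- .0005 of this row's;
       -- this partitions by dt only, so add genre to the partition to also require the same genre
       count(*) over (partition by dt order by to_number(channel)
                      range between .0005 preceding and .0005 following) cnt
from   record r;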

        ID NAME  GENRE   DT         CHANNEL   CH_MINUS    CH_PLUS        CNT
---------- ----- ------- ---------- ------- ---------- ---------- ----------
         7 Stan  rap     2013-09-16 14.0000    13.9995    14.0005          1
         8 Jake  country 2014-01-21 30.0000    29.9995    30.0005          1
         1 Scott rock    2014-01-21 30.0345     30.034     30.035          2
         5 Dave  rock    2014-01-21 30.0350    30.0345    30.0355          2
         2 Jim   rap     2014-01-21 55.0000    54.9995    55.0005          1
         9 Mike  country 2014-01-22 22.9995     22.999         23          2
         3 Dave  country 2014-01-22 23.0000    22.9995    23.0005          2
         4 Tim   rock    2014-01-22 25.0000    24.9995    25.0005          1
         6 John  rock    2014-03-24 23.0000    22.9995    23.0005          1
        11 Jodi  country 2015-01-22 23.0004    22.9999    23.0009          2
        10 Jodi  country 2015-01-22 23.0006    23.0001    23.0011          2

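My results don't match yours, but then your expected results don't seem to match your input data (e.g. the channel for the row with id = 5 changes between the input and the output data!)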
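For readers unfamiliar with `range between .0005 preceding and .0005 following`, the window can be mimicked with a sorted list and binary search. This is a hedged sketch, not Oracle's implementation: `range_window_counts` is a hypothetical helper, and it partitions by genre and day as the question asks, whereas the query above partitions by `dt` alone.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from decimal import Decimal


def range_window_counts(rows, tolerance=Decimal("0.0005")):
    """rows: iterable of (id, genre, day, channel-as-string).
    Returns {id: rows in the same partition with channel in [ch - tol, ch + tol]}."""
    partitions = defaultdict(list)
    for rid, genre, day, channel in rows:
        partitions[(genre, day)].append((Decimal(channel), rid))

    counts = {}
    for members in partitions.values():
        members.sort()
        channels = [ch for ch, _ in members]
        for ch, rid in members:
            lo = bisect_left(channels, ch - tolerance)   # first value >= ch - tol
            hi = bisect_right(channels, ch + tolerance)  # one past last value <= ch + tol
            counts[rid] = hi - lo
    return counts


if __name__ == "__main__":
    # Hypothetical rows spaced 0.0004 apart within one genre and day.
    demo = [
        (1, "rock", "2014-01-21", "0.0000"),
        (2, "rock", "2014-01-21", "0.0004"),
        (3, "rock", "2014-01-21", "0.0008"),
        (4, "rock", "2014-01-21", "0.0012"),
    ]
    print(range_window_counts(demo))
```

Unlike chaining, each row here sees only its own window, so rows spaced 0.0004 apart get counts of 2, 3, 3 and 2 rather than one shared total of 4.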