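How to return rows that meet a condition for two or more days/months/quarters "IN A ROW"?

Time: 2016-12-21 20:07:44

Tags: sql postgresql

I'm trying to return the rows where FactCount stays at or above a threshold of 10 for two or more months in a row.

Below is a sample of the output I currently have, followed by the query.

How would I achieve this? If there is anything I can clarify, please comment. Thanks for your input!

Existing table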

+--------+-----------+-----------+
| UserID | YearMonth | FactCount |
+--------+-----------+-----------+
| 5454   | 201601    | 5         |
+--------+-----------+-----------+
| 5454   | 201602    | 3         |
+--------+-----------+-----------+
| 5454   | 201603    | 11        |
+--------+-----------+-----------+
| 5454   | 201604    | 10        |
+--------+-----------+-----------+
| 5454   | 201605    | 6         |
+--------+-----------+-----------+

Desired output

Query:

SELECT
    UserID
    ,YearMonth
    ,SUM(FactCount) AS sumFact
    ,CASE WHEN SUM(FactCount) >= 10 THEN 1 ELSE 0 END AS "FactCount_>=10_Flag"
FROM
    tbl
GROUP BY
    UserID
    ,YearMonth

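2 Answers:

Answer 0: (score: 1)

Use lead to get the factcount of the next row for each userid, ordered by yearmonth. Find all userids where the current row's value is >= 10 and the next row's value is also >= 10. Then select all rows for those userids from the table.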

select * from tbl 
where userid in (select userid
                 from (select userid, yearmonth, factcount
                      ,lead(factcount) over(partition by userid order by yearmonth) nxt_factcount
                       from tbl) x
                 where factcount >=10 and nxt_factcount >= 10
                ) 
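
If only the qualifying rows themselves are wanted, rather than every row for a matching user, one possible variant looks at both the previous and the next row. This is a minimal sketch, not part of the original answer: the sample_rows CTE is a hypothetical stand-in for tbl that holds the rows from the question, and prev_factcount is a name introduced here for illustration. Like the query above, it treats adjacent rows as adjacent months, so it does not account for missing months.

```sql
-- Sketch only: sample_rows is a hypothetical stand-in for tbl.
WITH sample_rows (userid, yearmonth, factcount) AS (
    VALUES (5454, 201601, 5),
           (5454, 201602, 3),
           (5454, 201603, 11),
           (5454, 201604, 10),
           (5454, 201605, 6)
),
neighbours AS (
    SELECT userid, yearmonth, factcount,
           lag(factcount)  OVER (PARTITION BY userid ORDER BY yearmonth) AS prev_factcount,
           lead(factcount) OVER (PARTITION BY userid ORDER BY yearmonth) AS nxt_factcount
    FROM sample_rows
)
-- Keep a row when it and at least one adjacent row both reach the threshold.
-- A NULL neighbour (first or last row) never satisfies the comparison.
SELECT userid, yearmonth, factcount
FROM neighbours
WHERE factcount >= 10
  AND (prev_factcount >= 10 OR nxt_factcount >= 10)
ORDER BY userid, yearmonth;
```

With the sample rows from the question, this keeps only the 201603 and 201604 rows.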

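Edit: To treat the next row as the next month even when factcount values are missing, you first need to generate every combination of date and userid. Use generate_series to generate all the required dates and cross join them with the userids. Then left join the original table onto this and check for users who have consecutive factcount >= 10 at least once.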

with all_dates as (
SELECT dt:: date
FROM generate_series
        (date '2016-01-01' --change this series start accordingly 
        ,date '2017-12-31' --change this series end accordingly
        ,'1 month') dt
)
,all_months_count_combs as (
select  
 u.userid
,extract(year from a.dt)||'-'||extract(month from a.dt) yearmonth 
,f.factcount
,lead(f.factcount) over(partition by u.userid order by extract(year from a.dt),extract(month from a.dt)) nxt_factcount
from all_dates a
cross join (select distinct userid from foo) u
left join foo f on u.userid=f.userid
and substring(f.yearmonth,1,4)::int=extract(year from a.dt) 
and substring(f.yearmonth,5)::int=extract(month from a.dt) 
)
select * from foo 
where userid in (select distinct userid 
                 from all_months_count_combs 
                 where factcount >=10 and nxt_factcount >=10
                )

Sample Demo

Answer 1: (score: 1)

I think this is actually what you want...

WITH t AS (
  SELECT
    userid,
    make_date(
      substring(yearmonth::text, 1, 4)::int,
      substring(yearmonth::text, 5, 2)::int,
      1
    ) AS yearmonth,
    factcount
  FROM foo
)
SELECT userid, dategroup, count(*)
FROM (
  SELECT userid, yearmonth, factcount,
    count(is_reset) OVER (PARTITION BY userid ORDER BY yearmonth) AS dategroup
  FROM (
    SELECT userid, yearmonth, factcount,
      CASE WHEN (lag(yearmonth) OVER (PARTITION BY userid ORDER BY yearmonth) + '1 month'::interval)::date <> yearmonth
        THEN 1
      END AS is_reset
    FROM t
    WHERE factcount >= 10
  ) AS t2
) AS t3
GROUP BY userid, dategroup
HAVING count(*) > 1;

First, to address the problem of 201612 rolling into 201701 rather than 201613, we need to convert these values into something we can work with, namely date.

Here we convert yearmonth to a date type:

SELECT
  userid,
  make_date(
    substring(yearmonth::text, 1, 4)::int,
    substring(yearmonth::text, 5, 2)::int,
    1
  ) AS yearmonth,
  factcount
FROM foo

I put this in a CTE because

  • You shouldn't be storing dates as text such as YYYYMM.
  • This is the type conversion I do before the grunt work of solving the Islands and Gaps problem.
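
To see why integer YYYYMM values are awkward for "next month" arithmetic, the following sketch compares naive integer addition with date arithmetic. It uses literal values only, and the column aliases are names chosen here for illustration.

```sql
-- Sketch only: literal values, no table needed.
SELECT 201612 + 1 AS naive_next_yearmonth,                                 -- 201613, not a real month
       (make_date(2016, 12, 1) + '1 month'::interval)::date AS next_month; -- 2017-01-01
```

Once the value is a real date, adding one month crosses the year boundary correctly, which is what the lag comparison in the main query relies on.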

Next we work from the inside out..

SELECT userid, yearmonth, factcount,
  CASE WHEN (lag(yearmonth) OVER (PARTITION BY userid ORDER BY yearmonth) + '1 month'::interval)::date <> yearmonth
    THEN 1
  END AS is_reset
FROM t
WHERE factcount >= 10

Here we

  • Select only the factcount >= 10 rows. The statistics you're running ignore the other rows.
  • Create a new virtual column that returns 1 if yearmonth does not immediately succeed (come one month after) the previous row's yearmonth.

This returns a set like this,

 userid | yearmonth  | factcount | is_reset 
--------+------------+-----------+----------
   5454 | 2016-03-01 |        11 |         
   5454 | 2016-04-01 |        10 |         
   9987 | 2016-03-01 |        12 |         
   9987 | 2016-05-01 |        19 |        1

Then we wrap that again and count() our is_reset:

count(is_reset) OVER (PARTITION BY userid ORDER BY yearmonth) AS dategroup

This returns a set like this,

 userid | yearmonth  | factcount | dategroup 
--------+------------+-----------+-----------
   5454 | 2016-03-01 |        11 |         0
   5454 | 2016-04-01 |        10 |         0
   9987 | 2016-03-01 |        12 |         0
   9987 | 2016-05-01 |        19 |         1

Now we

  • Group by userid, dategroup.
  • Select count(*).

This shows you all users with consecutive months of factcount >= 10, regardless of year.

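 userid | dategroup | count 
--------+-----------+-------
   5454 |         0 |     2

And, as an added bonus, since it has to do the work anyway, it also tells you

  • how many consecutive months they have with factcount >= 10
  • whether they have other runs of two or more consecutive months with factcount >= 10. For example, what if they had January-February-March and then October-November, with a factcount of 11?

So you might see something like,

 userid | dategroup | count 
--------+-----------+-------
   5454 |         0 |     3
   5454 |         1 |     2

However, I think for your purposes you can use that output to do whatever you want. I.e., take the distinct userids from it and then join back to the main table if you want all rows for every user that has two or more consecutive months with factcount >= 10.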
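
The join back to the main table might look like the sketch below. It is one possible way to wire the pieces together, not necessarily the author's exact query: foo_sample is a hypothetical stand-in for foo, filled with the question's rows plus the two 9987 rows that appear in the sample output above, and streaks and months_in_a_row are names introduced here.

```sql
-- Sketch only: foo_sample is a hypothetical stand-in for foo.
WITH foo_sample (userid, yearmonth, factcount) AS (
    VALUES (5454, 201601, 5),
           (5454, 201602, 3),
           (5454, 201603, 11),
           (5454, 201604, 10),
           (5454, 201605, 6),
           (9987, 201603, 12),
           (9987, 201605, 19)
),
t AS (
    -- Convert the YYYYMM value to a real date (first day of the month).
    SELECT userid,
           make_date(substring(yearmonth::text, 1, 4)::int,
                     substring(yearmonth::text, 5, 2)::int,
                     1) AS yearmonth,
           factcount
    FROM foo_sample
),
streaks AS (
    -- One row per run of consecutive months with factcount >= 10.
    SELECT userid, dategroup, count(*) AS months_in_a_row
    FROM (
        SELECT userid, yearmonth,
               count(is_reset) OVER (PARTITION BY userid ORDER BY yearmonth) AS dategroup
        FROM (
            SELECT userid, yearmonth,
                   CASE WHEN (lag(yearmonth) OVER (PARTITION BY userid ORDER BY yearmonth)
                              + '1 month'::interval)::date <> yearmonth
                        THEN 1
                   END AS is_reset
            FROM t
            WHERE factcount >= 10
        ) AS t2
    ) AS t3
    GROUP BY userid, dategroup
    HAVING count(*) > 1
)
-- Return every original row for users that have at least one qualifying run.
SELECT f.*
FROM foo_sample AS f
JOIN (SELECT DISTINCT userid FROM streaks) AS s USING (userid)
ORDER BY f.userid, f.yearmonth;
```

With this sample data, only user 5454 has a run of two consecutive months (2016-03 and 2016-04), so all five of that user's rows are returned and user 9987 is excluded.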