SQL for a running count that takes start date and end date into account

Time: 2017-05-14 13:54:31

Tags: sql sql-server tsql pyspark hiveql

I have the following table and data.

CREATE TABLE [dbo].[Accounts](
    [AccountID] [int] IDENTITY(1,1) NOT NULL,
    [StartDate] [datetime] NULL,
    [EndDate] [datetime] NULL
)

INSERT INTO [dbo].[Accounts]
       ([StartDate]
       ,[EndDate])
 VALUES
      ('01/01/2012'  ,'02/01/2012'),
      ('01/06/2012' ,'07/01/2012'),
      ('01/08/2012'  ,'11/01/2012'),
      ('01/11/2012','01/01/2013'),
      ('02/07/2012' ,'01/01/2013'),
      ('04/01/2012' ,'01/01/2013'),
      ('06/01/2012'  ,'01/01/2013'),
      ('09/01/2012' ,'01/01/2013'),
      ('11/01/2012' ,'01/01/2013'),
      ('12/01/2012' ,'01/01/2014'),
      ('01/01/2013'  ,'02/01/2014'),
      ('01/06/2013' ,'07/01/2014'),
      ('01/08/2013'  ,'11/01/2014'),
      ('01/11/2013','01/01/2014'),
      ('02/07/2013' ,'01/01/2014'),
      ('04/01/2013' ,'01/01/2014'),
      ('06/01/2013'  ,'01/01/2014'),
      ('09/01/2013' ,'01/01/2014'),
      ('11/01/2013' ,'01/01/2014'),
      ('12/01/2013' ,'01/01/2015'),
      ('01/01/2014'  ,'02/01/2015'),
      ('01/06/2014' ,'07/01/2015'),
      ('01/08/2014'  ,'11/01/2015'),
      ('01/11/2014','01/01/2015'),
      ('02/07/2014' ,'01/01/2015'),
      ('04/01/2014' ,'01/01/2015'),
      ('06/01/2014'  ,'01/01/2015'),
      ('09/01/2014' ,'01/01/2015'),
      ('11/01/2014' ,'01/01/2015'),
      ('12/01/2014' ,'01/01/2015'),
      ('01/01/2014'  ,'02/01/2015'),
      ('01/06/2014' ,'07/01/2015'),
      ('01/08/2014'  ,'11/01/2015'),
      ('01/11/2014','01/01/2015'),
      ('02/07/2014' ,'01/01/2015'),
      ('04/01/2013' ,'01/01/2014'),
      ('06/01/2013'  ,'01/01/2014'),
      ('09/01/2013' ,'01/01/2014'),
      ('11/01/2013' ,'01/01/2014'),
      ('12/01/2013' ,'01/01/2015')


SELECT datename(month, [StartDate])+' '+datename(year, [StartDate]) AS 'Start Date',
       COUNT([AccountID]) AS 'Accounts by Month'
FROM [dbo].[Accounts]
GROUP BY datename(month, [StartDate])+' '+datename(year, [StartDate]);


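I need two queries that return running counts grouped by year and month. Each result should have two columns: "Month - Year" and "Running count of accounts".

  1. A running count of accounts grouped by StartDate (at year and month level).
  2. A running count of accounts that also considers EndDate: an account whose EndDate has already passed by a given month should not be included in that month's running count. Again grouped by StartDate (at year and month level).
  3. I would like solutions in T-SQL and HiveQL, or even with the PySpark DataFrames API.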
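
To make the two requirements concrete, here is a minimal sketch that runs both queries against an in-memory SQLite database from Python. It is not T-SQL, HiveQL, or PySpark: `strftime` is SQLite-specific, the window function needs SQLite 3.25 or newer, and the function name `running_counts` is made up for this illustration. It also rests on two assumptions that the question does not confirm: the dates are in MM/DD/YYYY order (rewritten here as ISO strings), and an account counts as active in a month when its start month is on or before that month and its end month is strictly later. Only the first six rows of the sample data are used.

```python
import sqlite3

# First six sample rows, rewritten as ISO dates.
# Assumption: the original literals are MM/DD/YYYY.
ROWS = [
    ("2012-01-01", "2012-02-01"),
    ("2012-01-06", "2012-07-01"),
    ("2012-01-08", "2012-11-01"),
    ("2012-01-11", "2013-01-01"),
    ("2012-02-07", "2013-01-01"),
    ("2012-04-01", "2013-01-01"),
]

# Query 1: count accounts per start month, then accumulate with a window SUM.
RUNNING_BY_START = """
WITH monthly AS (
    SELECT strftime('%Y-%m', StartDate) AS ym, COUNT(*) AS n
    FROM Accounts
    GROUP BY strftime('%Y-%m', StartDate)
)
SELECT ym, SUM(n) OVER (ORDER BY ym) AS running_accounts
FROM monthly
ORDER BY ym
"""

# Query 2: for each start month, count accounts that started on or before
# that month and whose end month is strictly later (assumed interpretation).
RUNNING_ACTIVE = """
WITH months AS (
    SELECT DISTINCT strftime('%Y-%m', StartDate) AS ym FROM Accounts
)
SELECT m.ym, COUNT(a.AccountID) AS active_accounts
FROM months AS m
LEFT JOIN Accounts AS a
       ON strftime('%Y-%m', a.StartDate) <= m.ym
      AND strftime('%Y-%m', a.EndDate) > m.ym
GROUP BY m.ym
ORDER BY m.ym
"""


def running_counts(rows):
    """Return (running count by start month, active count by start month)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Accounts ("
            "AccountID INTEGER PRIMARY KEY, StartDate TEXT, EndDate TEXT)"
        )
        conn.executemany(
            "INSERT INTO Accounts (StartDate, EndDate) VALUES (?, ?)", rows
        )
        started = conn.execute(RUNNING_BY_START).fetchall()
        active = conn.execute(RUNNING_ACTIVE).fetchall()
    finally:
        conn.close()
    return started, active


if __name__ == "__main__":
    started, active = running_counts(ROWS)
    print(started)  # [('2012-01', 4), ('2012-02', 5), ('2012-04', 6)]
    print(active)   # [('2012-01', 4), ('2012-02', 4), ('2012-04', 5)]
```

The first account ends in February 2012, so it drops out of the second result from 2012-02 onward while still contributing to the first. The `SUM(...) OVER (ORDER BY ...)` pattern is also available in SQL Server 2012+ and in Hive, but the month-formatting expression would need to be replaced with each engine's own date functions.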

0 answers:

No answers yet.