MySQL SELECT query to turn history records over time into a weekly summary

Time: 2014-06-12 09:35:37

Tags: mysql sql

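I have a history table ('property_histories') that logs events in our property management system. These events can be used to determine whether a particular property is available to rent, and I am trying to build a (weekly) summary of 'live' properties.

The 4 events in question are 'published', 'unpublished', 'hidden_from_search' and 'unhidden_from_search'.

For a property to be live it must be:

  • Published.
  • If it has ever been unpublished, a later 'published' event must be the more recent of the two.
  • If it has ever been 'hidden_from_search', a later 'unhidden_from_search' event must be the more recent of the two.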
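As a rough illustration of these rules (not part of the original question), the sketch below reads them as two independent flags, "published" and "hidden", each set by the most recent event of its kind. The function name `is_live` and the `as_of` parameter are hypothetical, and the two-flag reading is an assumption about what the rules mean.

```python
from datetime import date

def is_live(events, as_of):
    """Return True if a single property is live on `as_of`.

    `events` is an iterable of (status, date) tuples for ONE property.
    Assumption: 'published' and 'hidden' are tracked independently.
    """
    published = False
    hidden = False
    # Replay the events in date order, ignoring anything after `as_of`.
    for status, when in sorted(events, key=lambda e: e[1]):
        if when > as_of:
            break
        if status == "published":
            published = True
        elif status == "unpublished":
            published = False
        elif status == "hidden_from_search":
            hidden = True
        elif status == "unhidden_from_search":
            hidden = False
    return published and not hidden

# Sample history for property 325407 (Paris) from the question.
paris = [
    ("published", date(2014, 1, 1)),
    ("hidden_from_search", date(2014, 1, 24)),
    ("unhidden_from_search", date(2014, 2, 5)),
    ("unpublished", date(2014, 2, 15)),
]

print(is_live(paris, date(2014, 1, 10)))  # True: published, never hidden
print(is_live(paris, date(2014, 1, 30)))  # False: hidden on 2014-01-24
print(is_live(paris, date(2014, 2, 10)))  # True: unhidden on 2014-02-05
print(is_live(paris, date(2014, 3, 1)))   # False: unpublished on 2014-02-15
```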

Most properties have a simple history, most likely consisting of a single 'published' event, but some are more complicated, for example:

property_histories
----------------------------
id   |   property_id |   City     |   status               |   date    
1    |   325407      |   Paris    |   published            |   2014-01-01
2    |   325407      |   Paris    |   hidden_from_search   |   2014-01-24
3    |   325407      |   Paris    |   unhidden_from_search |   2014-02-05
4    |   325407      |   Paris    |   unpublished          |   2014-02-15
5    |   410008      |   London   |   published            |   2014-01-01           
6    |   410008      |   London   |   unpublished          |   2014-01-10
7    |   410008      |   London   |   published            |   2014-01-18

My goal is to be able to count 'live' properties by week:

weekly_count
----------------------------
Year  |   Week   |   City     |   Live_Count 
2014  |   1      |   Paris    |   0      
2014  |   1      |   London   |   0
2014  |   2      |   Paris    |   1
2014  |   2      |   London   |   1
2014  |   3      |   Paris    |   1
2014  |   3      |   London   |   0
2014  |   4      |   Paris    |   1
2014  |   4      |   London   |   1
2014  |   5      |   Paris    |   0
2014  |   5      |   London   |   1
2014  |   6      |   Paris    |   0
2014  |   6      |   London   |   1
2014  |   7      |   Paris    |   1
2014  |   7      |   London   |   0
2014  |   8      |   Paris    |   0
2014  |   8      |   London   |   1
2014  |   9      |   Paris    |   0
2014  |   9      |   London   |   1
----------------------------

Help appreciated!!

3 Answers:

Answer 0: (score: 1)

I have a feeling I am missing a simpler way of doing this.

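However, the following query uses 2 subqueries. The first gets all the published/unpublished ranges for a property (i.e. each published date paired with the minimum unpublished date after it), while the second does the same for properties hidden from search.

These are then joined to properties on the property id, where the current date falls within the range returned by the subquery. The WHERE clause then checks that a record was matched in the published subquery and that none was found in the hidden subquery.

I had to use DISTINCT, as otherwise multiple published dates preceding a single unpublish would cause duplicate property rows to be returned.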

SELECT DISTINCT properties.*
FROM properties
INNER JOIN 
(
    SELECT a.property_id, a.created_at AS start_date, IFNULL(MIN(b.created_at), NOW()) AS end_date
    FROM property_histories a
    LEFT OUTER JOIN property_histories b
    ON a.property_id = b.property_id
    AND a.created_at < b.created_at
    AND b.status = 'unpublished'  -- filter in ON so the LEFT JOIN keeps rows with no later unpublish
    WHERE a.status = 'published'
    GROUP BY a.property_id, a.created_at
) published
ON properties.property_id = published.property_id
AND NOW() BETWEEN published.start_date AND published.end_date
LEFT OUTER JOIN
(
    SELECT a.property_id, a.created_at AS start_date, IFNULL(MIN(b.created_at), NOW()) AS end_date
    FROM property_histories a
    LEFT OUTER JOIN property_histories b
    ON a.property_id = b.property_id
    AND a.created_at < b.created_at
    AND b.status = 'unhidden_from_search'  -- filter in ON so a still-hidden property keeps an open range
    WHERE a.status = 'hidden_from_search'
    GROUP BY a.property_id, a.created_at
) hidden
ON properties.property_id = hidden.property_id
AND NOW() BETWEEN hidden.start_date AND hidden.end_date
WHERE published.property_id IS NOT NULL
AND hidden.property_id IS NULL
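
The range pairing that the two subqueries perform can be sketched outside SQL. The following is only an illustration under stated assumptions: the helper names `ranges`, `covered` and `live_on` are hypothetical, an open range falls back to `now` in the same way the query uses IFNULL(..., NOW()), and checking an arbitrary `day` instead of the current time is an extension that the query itself does not make.

```python
from datetime import date

def ranges(events, open_status, close_status, now):
    """Pair every open event with the earliest later close event.

    `events` is a list of (status, date) tuples for ONE property.
    A range with no later close event ends at `now`.
    """
    closes = sorted(d for s, d in events if s == close_status)
    result = []
    for status, start in events:
        if status != open_status:
            continue
        later = [c for c in closes if c > start]
        result.append((start, later[0] if later else now))
    return result

def covered(rs, day):
    # Inclusive on both ends, like SQL BETWEEN.
    return any(start <= day <= end for start, end in rs)

def live_on(events, day, now):
    published = ranges(events, "published", "unpublished", now)
    hidden = ranges(events, "hidden_from_search", "unhidden_from_search", now)
    return covered(published, day) and not covered(hidden, day)

now = date(2014, 6, 12)
paris = [
    ("published", date(2014, 1, 1)),
    ("hidden_from_search", date(2014, 1, 24)),
    ("unhidden_from_search", date(2014, 2, 5)),
    ("unpublished", date(2014, 2, 15)),
]
london = [
    ("published", date(2014, 1, 1)),
    ("unpublished", date(2014, 1, 10)),
    ("published", date(2014, 1, 18)),
]

print(ranges(london, "published", "unpublished", now))
print(live_on(london, now, now))  # True: second published range is still open
print(live_on(paris, now, now))   # False: unpublished on 2014-02-15
```

Because the range ends are inclusive, a property still counts as published on the day of its unpublish event, which matches the behaviour of BETWEEN in the query.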

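Answer 1: (score: 1)

I use a numbers table as a convenient shortcut. Essentially, your problem comes down to wanting a running sum of published or unhidden events versus unpublished or hidden events. At that point the property ID becomes a moot point in the view (assuming its uniqueness is properly constrained elsewhere), and all we need is a custom sum. I have an example on SQLFiddle. Here is the query: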

select years.n + 2013 as year, weeks.n as week
  , c.City
  ,
  (select
      sum(case
        when status in ('published','unhidden_from_search') then 1
        when status in ('unpublished','hidden_from_search') then -1
        else 0
      end)
    from property_histories p2
    where weekofyear(p2.date) <= weeks.n
       and p2.city=c.city
  ) AS Live_Count
from numbers weeks
  inner join numbers years on weeks.n <= 52
  cross join (select City from property_histories group by city) c
where years.n + 2013 <= (select max(year(date)) from property_histories)
group by years.n + 2013, weeks.n
  , c.City
;
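
To make the running-sum idea concrete, here is a small sketch of the same arithmetic in Python, applied to the sample rows from the question. It is an illustration only: the names `DELTA` and `running_sum_by_week` are hypothetical, and it assumes that MySQL's WEEKOFYEAR() and Python's `isocalendar()` both yield the ISO 8601 week number.

```python
from datetime import date

# +1 for events that make a property live, -1 for events that take it away.
DELTA = {
    "published": 1,
    "unhidden_from_search": 1,
    "unpublished": -1,
    "hidden_from_search": -1,
}

def running_sum_by_week(rows, city, weeks):
    """Sum the deltas of all events in `city` up to and including each week.

    `rows` is a list of (city, status, date) tuples.
    """
    result = {}
    for week in weeks:
        result[week] = sum(
            DELTA.get(status, 0)
            for c, status, d in rows
            if c == city and d.isocalendar()[1] <= week
        )
    return result

rows = [
    ("Paris", "published", date(2014, 1, 1)),
    ("Paris", "hidden_from_search", date(2014, 1, 24)),
    ("Paris", "unhidden_from_search", date(2014, 2, 5)),
    ("Paris", "unpublished", date(2014, 2, 15)),
    ("London", "published", date(2014, 1, 1)),
    ("London", "unpublished", date(2014, 1, 10)),
    ("London", "published", date(2014, 1, 18)),
]

print(running_sum_by_week(rows, "Paris", range(1, 9)))
print(running_sum_by_week(rows, "London", range(1, 9)))
```

One thing the sketch makes visible: a property that is unpublished while it is still hidden contributes -1 to the sum, so the total only equals a live count when each property's events alternate cleanly.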

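Answer 2: (score: 1)

Your own test results do not match what you are asking for. You say the live count is per week, which means London should appear in week 1, since it was published in week 1 and then unpublished in week 2.

Assuming the week starts on Sunday (the default in SQL Server with US English settings), this will work. Just enter your own date range and replace my numbers table with your own.

If you need Monday as the start day, use this at the top of the query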

SET DATEFIRST 1

Mocking up your test data:

-- Create dummy data
CREATE TABLE #property_histories
(
    id int, property_id int, City varchar(50), status varchar(50), date date
)
INSERT INTO #property_histories
    SELECT 1    ,   325407      ,   'Paris'    ,   'published'            ,   '2014-01-01' UNION ALL
    SELECT 2    ,   325407      ,   'Paris'    ,   'hidden_from_search'   ,   '2014-01-24' UNION ALL
    SELECT 3    ,   325407      ,   'Paris'    ,   'unhidden_from_search' ,   '2014-02-05' UNION ALL
    SELECT 4    ,   325407      ,   'Paris'   ,   'unpublished'          ,   '2014-02-15' UNION ALL
    SELECT 5    ,   410008      ,   'London'   ,   'published'            ,   '2014-01-01' UNION ALL        
    SELECT 6    ,   410008      ,   'London'   ,   'unpublished'          ,   '2014-01-10' UNION ALL
    SELECT 7    ,   410008      ,   'London'   ,   'published'            ,   '2014-01-18' 

Now the code:

    -- TODO: Set your date range
    DECLARE @SD Datetime = '2014-01-01'
    DECLARE @ED Datetime = '2014-12-31'
    DECLARE @Wks INT = Datediff(week,@SD,@ED) -- Don't change this

    -- Generate dates table
    SELECT  NumberID as 'Week', 
            DATEADD(DAY, 1-DATEPART(WEEKDAY, DateAdd(week,NumberID-1,@SD)), DateAdd(week,NumberID-1,@SD)) as 'WeekStart', 
            DATEADD(DAY, 7-DATEPART(WEEKDAY, DateAdd(week,NumberID-1,@SD)), DateAdd(week,NumberID-1,@SD)) as 'WeekEnd'
    INTO    #Dates
    FROM    Generic.tblNumbers  -- TODO: use your own Numbers table here
    WHERE   NumberID BETWEEN 1 AND @Wks

-- Now generate report 
SELECT  T.Year, T.Week, T.City, 
        SUM(CASE    WHEN PH1.status = 'published' THEN 1
                    WHEN PH1.status = 'unhidden_from_search' THEN 1
                    ELSE 0 END) as 'Live_Count'

FROM #Dates D1
LEFT JOIN
    -- Get latest date per week
    (SELECT YEAR(D.WeekStart) as 'Year',
            D.Week,
            PH.City,
            PH.property_ID,
            MAX(PH.date) as MaxDate

        FROM    #Dates D
        LEFT JOIN   #property_histories PH
                ON  PH.date BETWEEN @SD AND D.WeekEnd
        GROUP BY D.WeekStart, D.Week, D.WeekEnd, PH.City, PH.property_id
    ) T
    ON T.Week = D1.Week

LEFT JOIN #property_histories PH1
        ON PH1.City = T.City AND PH1.property_id = T.property_id AND PH1.date = T.MaxDate

GROUP BY T.Year, T.Week, T.City

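Breaking down the logic: first I create a helper table containing the week number, week start and week end dates. The week start is largely redundant, but might come in handy for reporting.

Then I use a subquery to get the latest date relevant to each week/city/property. For that "max" date, city and property I fetch the status and, if it is live, sum it. So in layman's terms: get the latest status for every property in every week and SUM it (if live).

Unlike the other answers posted, this solution copes with gaps in the data. If the latest recorded status for a city and property goes all the way back to week 1, it is still used for any subsequent week.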
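
The "latest status per property per week" logic described above can be sketched in Python as follows. This is a minimal illustration, not a port of the T-SQL: the names `saturday_week_ends` and `weekly_live_counts` are hypothetical, weeks are assumed to run Sunday to Saturday as in the answer, and a property counts as live when its latest status is 'published' or 'unhidden_from_search', which is the rule the report query applies.

```python
from datetime import date, timedelta

LIVE = {"published", "unhidden_from_search"}

def saturday_week_ends(first_day, n_weeks):
    """Week-end dates (Saturdays) for Sunday-start weeks from `first_day`."""
    # date.weekday(): Monday=0 ... Saturday=5, Sunday=6
    offset = (5 - first_day.weekday()) % 7
    first_end = first_day + timedelta(days=offset)
    return [first_end + timedelta(weeks=i) for i in range(n_weeks)]

def weekly_live_counts(rows, week_ends):
    """Count, per (week number, city), properties whose latest status is live.

    `rows` is a list of (property_id, city, status, date) tuples.
    """
    counts = {}
    for week_no, week_end in enumerate(week_ends, start=1):
        latest = {}  # (property_id, city) -> (date, status)
        for pid, city, status, d in rows:
            key = (pid, city)
            if d <= week_end and (key not in latest or d >= latest[key][0]):
                latest[key] = (d, status)
        for (pid, city), (_, status) in latest.items():
            counts.setdefault((week_no, city), 0)
            if status in LIVE:
                counts[(week_no, city)] += 1
    return counts

rows = [
    (325407, "Paris", "published", date(2014, 1, 1)),
    (325407, "Paris", "hidden_from_search", date(2014, 1, 24)),
    (325407, "Paris", "unhidden_from_search", date(2014, 2, 5)),
    (325407, "Paris", "unpublished", date(2014, 2, 15)),
    (410008, "London", "published", date(2014, 1, 1)),
    (410008, "London", "unpublished", date(2014, 1, 10)),
    (410008, "London", "published", date(2014, 1, 18)),
]

week_ends = saturday_week_ends(date(2014, 1, 1), 8)
counts = weekly_live_counts(rows, week_ends)
for (week_no, city), n in sorted(counts.items()):
    print(week_no, city, n)
```

Because every week looks back at all events up to its end date, a property with no event in a given week still carries its last known status forward, which is the gap handling the answer refers to.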