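I have data on locations: each record contains a location ID, a set of three 0/1 flags indicating whether the location's latitude, longitude, or address changed, and the month-end in which the change occurred.
So I'm looking at something like this: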
+------------+-------------+--------------+---------------+---------------------+
| LOCATIONID | XCOORDHANGE | YCOORDCHANGE | ADDRESSCHANGE | REPORTPERIOD |
+------------+-------------+--------------+---------------+---------------------+
| 1 | 0 | 0 | 1 | 2010-01-31 00:00:00 |
+------------+-------------+--------------+---------------+---------------------+
| 2 | 1 | 1 | 1 | 2010-03-31 00:00:00 |
+------------+-------------+--------------+---------------+---------------------+
| 1 | 1 | 1 | 0 | 2010-08-31 00:00:00 |
+------------+-------------+--------------+---------------+---------------------+
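My task is to identify locations that have moved. A move is defined as an x or y coordinate change AND an address change. (Sometimes a location is re-surveyed and its coordinates change but the address does not, and sometimes an address changes without a subsequent coordinate change. I'm not interested in those sites.)
Identifying records where all 3 flags are set to 1 is easy. The problem is that the address and coordinate changes do not always happen at the same time. For example, location 1 shows an address change on 2010-01-31 but a coordinate change on 2010-08-31. I need to look at each record and determine whether the "move" criteria are met within one year of the first change. For location 1 in the example above, if the x and/or y coordinate change occurs up to 1 year after the address change (that is, the criteria are met within 1 year of each other), I consider it a "move". An added wrinkle is that a location can move more than once during the 4-year period I'm investigating. I'm doing this for 2010-01-31 through 2014-12-31.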
My first attempt was to use ROW_NUMBER() OVER (PARTITION BY LOCATIONID ORDER BY REPORTPERIOD ASC) as rn and a self-join on a.rn = a.rn+1 to link one record to the next, but this ignores locations that moved multiple times.
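For readers following along, here is a rough reconstruction of what such an attempt might look like. It is only a sketch: the table variable @loc and the alias names are hypothetical, and it assumes the join was meant to pair row rn with row rn + 1 of the same location.

```sql
-- Sample rows from the table above, held in a hypothetical table variable.
DECLARE @loc TABLE (
    LOCATIONID    INT,
    XCOORDHANGE   BIT,
    YCOORDCHANGE  BIT,
    ADDRESSCHANGE BIT,
    REPORTPERIOD  DATETIME
);

INSERT INTO @loc VALUES
    (1, 0, 0, 1, '20100131'),
    (2, 1, 1, 1, '20100331'),
    (1, 1, 1, 0, '20100831');

-- Number each location's records in date order, then pair every record
-- with the one immediately after it.
WITH numbered AS (
    SELECT LOCATIONID, XCOORDHANGE, YCOORDCHANGE, ADDRESSCHANGE, REPORTPERIOD,
           ROW_NUMBER() OVER (PARTITION BY LOCATIONID ORDER BY REPORTPERIOD ASC) AS rn
    FROM @loc
)
SELECT a.LOCATIONID,
       a.REPORTPERIOD AS FirstChange,
       b.REPORTPERIOD AS NextChange
FROM numbered AS a
JOIN numbered AS b
  ON b.LOCATIONID = a.LOCATIONID
 AND b.rn = a.rn + 1;
```

Because each record is paired only with its immediate successor, changes that are separated by an intervening record are never compared.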
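The end goal is to add a MEETSREQ column of type bit, where 1 indicates that the location had a coordinate change and an address change that occurred within 1 year of each other.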
The output would look like this:
+------------+-------------+--------------+---------------+---------------------+---------+
| LOCATIONID | XCOORDHANGE | YCOORDCHANGE | ADDRESSCHANGE | REPORTPERIOD | MEETREQ |
+------------+-------------+--------------+---------------+---------------------+---------+
| 1 | 0 | 0 | 1 | 2010-01-31 00:00:00 | 1 |
+------------+-------------+--------------+---------------+---------------------+---------+
| 2 | 1 | 1 | 1 | 2010-03-31 00:00:00 | 1 |
+------------+-------------+--------------+---------------+---------------------+---------+
| 1 | 1 | 1 | 0 | 2010-08-31 00:00:00 | 0 |
+------------+-------------+--------------+---------------+---------------------+---------+
| 3 | 0 | 0 | 1 | 2011-02-28 00:00:00 | 0 |
+------------+-------------+--------------+---------------+---------------------+---------+
| 4 | 1 | 1 | 0 | 2011-03-31 00:00:00 | 0 |
+------------+-------------+--------------+---------------+---------------------+---------+
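This is SQL Server 2008 R2. Thanks for your time; I hope I have made this clear enough. I can provide additional details if necessary.
Answer 0 (score: 0)
You could do it like this. Note that it uses an "evil" cursor, though. Personally, I find that a cursor keeps things clear when you are implementing complex business logic.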
DECLARE @LOCATIONID INT
DECLARE @XCOORDHANGE INT
DECLARE @YCOORDCHANGE INT
DECLARE @ADDRESSCHANGE INT
DECLARE @REPORTPERIOD DATETIME
CREATE TABLE #Temp1 ( LOCATIONID INT, HASMOVED BIT );
-- find all locations that have an address change
DECLARE db_cursor CURSOR FOR
SELECT LOCATIONID, XCOORDHANGE, YCOORDCHANGE, ADDRESSCHANGE, REPORTPERIOD
FROM [TABLENAME]
WHERE ADDRESSCHANGE = 1
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @LOCATIONID, @XCOORDHANGE, @YCOORDCHANGE, @ADDRESSCHANGE, @REPORTPERIOD
WHILE @@FETCH_STATUS = 0
BEGIN
-- find any other occurrence of this location within one year of the address change,
-- excluding locations we've already recorded; it must have an x or y coord change
IF EXISTS(SELECT 0 FROM [TABLENAME] WHERE LOCATIONID = @LOCATIONID
AND LOCATIONID NOT IN(SELECT LOCATIONID FROM #Temp1)
AND (XCOORDHANGE = 1 OR YCOORDCHANGE = 1)
AND REPORTPERIOD BETWEEN DATEADD(year, -1, @REPORTPERIOD) AND DATEADD(year, 1, @REPORTPERIOD)
)
INSERT INTO #Temp1 (LOCATIONID, HASMOVED) VALUES (@LOCATIONID, 1)
ELSE
INSERT INTO #Temp1 (LOCATIONID, HASMOVED) VALUES (@LOCATIONID, 0)
FETCH NEXT FROM db_cursor INTO @LOCATIONID, @XCOORDHANGE, @YCOORDCHANGE, @ADDRESSCHANGE, @REPORTPERIOD
END
CLOSE db_cursor
DEALLOCATE db_cursor
SELECT LOCATIONID, HASMOVED FROM #Temp1
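If you like, you can join the results back to your existing [TABLENAME] to include the HASMOVED column.
This may not be the exact logic you specified, but it should give you the general idea of the approach I'm suggesting.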
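For comparison, the same idea can be sketched without a cursor, using a correlated EXISTS. This is a different technique from the cursor above, not a rewrite of it. It assumes the flag belongs on the address-change record and that the coordinate change is on that record or within the following year; the table variable @loc is a hypothetical stand-in for the real table.

```sql
-- Sample rows from the question, held in a hypothetical table variable.
DECLARE @loc TABLE (
    LOCATIONID    INT,
    XCOORDHANGE   BIT,
    YCOORDCHANGE  BIT,
    ADDRESSCHANGE BIT,
    REPORTPERIOD  DATETIME
);

INSERT INTO @loc VALUES
    (1, 0, 0, 1, '20100131'),
    (2, 1, 1, 1, '20100331'),
    (1, 1, 1, 0, '20100831'),
    (3, 0, 0, 1, '20110228'),
    (4, 1, 1, 0, '20110331');

SELECT a.*,
       CASE WHEN a.ADDRESSCHANGE = 1
             AND EXISTS (SELECT 1
                         FROM @loc AS c
                         WHERE c.LOCATIONID = a.LOCATIONID
                           AND (c.XCOORDHANGE = 1 OR c.YCOORDCHANGE = 1)
                           -- assumption: the coordinate change falls on this record
                           -- or within the year that follows it
                           AND c.REPORTPERIOD >= a.REPORTPERIOD
                           AND c.REPORTPERIOD <= DATEADD(year, 1, a.REPORTPERIOD))
            THEN 1 ELSE 0
       END AS MEETSREQ
FROM @loc AS a
ORDER BY a.REPORTPERIOD;
```

On the five sample rows this flags the 2010-01-31 record of location 1 and the record of location 2, which matches the MEETREQ column in the question. Because every record is evaluated on its own, a location that moves more than once gets one flag per qualifying address change. A coordinate change that precedes the address change is not caught by this sketch.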
Answer 1 (score: 0)
Since you only care that (x or y) and the address are non-zero within a given period, I used SUM in the inner query:
SELECT LocationID
,SUM(Xcoord) AS x
,SUM(Ycoord) AS y
,SUM(Address) AS a
FROM myTable
WHERE Period BETWEEN '2010-01-01' AND '2010-12-31'
GROUP BY LocationID
Then, in the outer query, include a column that uses CASE:
SELECT LocationID
,(CASE WHEN (x > 0 OR y > 0) AND a > 0 THEN 1 ELSE 0 END) AS MeetsReq
FROM (
SELECT LocationID
,SUM(Xcoord) AS x
,SUM(Ycoord) AS y
,SUM(Address) AS a
FROM myTable
WHERE Period BETWEEN '2010-01-01' AND '2010-12-31'
GROUP BY LocationID
) AS isrc
Then select from the base table, left-joined to the subquery, changing NULL values of MeetsReq to 0:
/* This is the final query.
The 2 queries above are included here,
and were separated only for explanation purposes */
SELECT main.*, COALESCE(src.MeetsReq, 0) AS MeetsReq
FROM myTable AS main
LEFT OUTER JOIN (
SELECT LocationID
,(CASE WHEN (x > 0 OR y > 0) AND a > 0 THEN 1 ELSE 0 END) AS MeetsReq
FROM (
SELECT LocationID
,SUM(Xcoord) AS x
,SUM(Ycoord) AS y
,SUM(Address) AS a
FROM myTable
WHERE Period BETWEEN '2010-01-01' AND '2010-12-31'
GROUP BY LocationID
) AS isrc
) AS src ON main.LocationID = src.LocationID
Note, though, that if a location is flagged 1 for MeetsReq, every record for that location will carry the same value.
Answer 2 (score: 0)
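One way to get a per-record flag out of the same SUM idea is to replace the fixed 2010 range with a window that starts at each record. The following is only a sketch under assumptions: the table variable @myTable is hypothetical, the column names follow the queries above, the flag columns are assumed to be INT (SUM does not accept BIT), and the window is taken to be the 12 months beginning at the current record.

```sql
-- Hypothetical stand-in for myTable, loaded with the question's sample rows.
DECLARE @myTable TABLE (
    LocationID INT,
    Xcoord     INT,
    Ycoord     INT,
    Address    INT,
    Period     DATETIME
);

INSERT INTO @myTable VALUES
    (1, 0, 0, 1, '20100131'),
    (2, 1, 1, 1, '20100331'),
    (1, 1, 1, 0, '20100831'),
    (3, 0, 0, 1, '20110228'),
    (4, 1, 1, 0, '20110331');

SELECT main.*,
       CASE WHEN (w.x > 0 OR w.y > 0) AND w.a > 0 THEN 1 ELSE 0 END AS MeetsReq
FROM @myTable AS main
OUTER APPLY (
    -- Sum this location's flags over the 12 months starting at the current record.
    SELECT SUM(t2.Xcoord)  AS x,
           SUM(t2.Ycoord)  AS y,
           SUM(t2.Address) AS a
    FROM @myTable AS t2
    WHERE t2.LocationID = main.LocationID
      AND t2.Period >= main.Period
      AND t2.Period <  DATEADD(year, 1, main.Period)
) AS w;
```

Each record is now judged against its own window, so only the record that starts a qualifying sequence is flagged. Unlike an address-first rule, this also flags a coordinate change that is followed by an address change within the window, which may or may not be what is wanted.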
In SQL Server 2012+, you can do this using LEAD():
select t.*,
(case when lead(addresschange) over (partition by locationid order by reportperiod) <> addresschange and
(lead(xcoordchange) over (partition by locationid order by reportperiod) <> xcoordchange or
lead(ycoordchange) over (partition by locationid order by reportperiod) <> ycoordchange
)
then 0
else 1
end) as meetreq
from t;
In earlier versions, you can use outer apply:
select t.*,
(case when tnext.addresschange <> t.addresschange and
(tnext.xcoordchange <> t.xcoordchange or
tnext.ycoordchange <> t.ycoordchange
)
then 0
else 1
end) as meetreq
from t outer apply
(select top 1 t2.*
from t t2
where t2.locationid = t.locationid and t2.reportperiod > t.reportperiod
order by t2.reportperiod asc
) tnext;