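I have a dataset of every location my phone has been. (I got it through Google Takeout, if you're interested.) The problem with the data is that at some point I got a second phone, and the dataset has no information that lets me tell which phone a given point came from. So if I leave one phone at home, the data shows me in two places at once. I decided to write a query that tries to find the adjacent point by working out which of the previous 5 points is closest, while eliminating any point I would have had to travel faster than 150 mph to reach.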
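To make the speed check concrete, here is a minimal sketch of the calculation on its own. The two points and the elapsed time are made-up values for illustration, not rows from my data. It assumes the points use SRID 4326, where `STDistance` returns meters.

```sql
-- Hypothetical sample values, not taken from the dataset.
DECLARE @a geography = geography::Point(47.6062, -122.3321, 4326);  -- lat, long, SRID
DECLARE @b geography = geography::Point(45.5152, -122.6784, 4326);
DECLARE @seconds int = 3600;  -- elapsed time between the two fixes

-- STDistance returns meters for SRID 4326; 1 m/s = 2.23694 mph.
SELECT @a.STDistance(@b) / @seconds * 2.23694 AS speed_mph;
```

Any candidate point whose `speed_mph` comes out at 150 or more is treated as belonging to the other phone.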
The table definition for the data is as follows:
CREATE TABLE [dbo].[locationdata](
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [t] [datetime] NULL,
    [lat] [float] NULL,
    [long] [float] NULL,
    [accuracy] [smallint] NULL,
    [activity] [varchar](14) NULL,
    [confidence] [int] NULL,
    [velocity] [varchar](2) NULL,
    [altitude] [smallint] NULL,
    [heading] [smallint] NULL,
    [point] [geography] NULL,
    [tag] [varchar](50) NULL,
    CONSTRAINT [PK_locationdata] PRIMARY KEY CLUSTERED
    (
        [ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
The rows were inserted in chronological order, so the IDs are in the correct order, but multiple points can share the same timestamp.
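As a quick illustration of the shared timestamps, a query along these lines lists every value of `t` that appears on more than one row. It is only a sketch against the table above; the alias `points_at_t` is a name I made up.

```sql
-- List timestamps that occur on more than one row.
SELECT t, COUNT(*) AS points_at_t
FROM dbo.locationdata
GROUP BY t
HAVING COUNT(*) > 1
ORDER BY t;
```

These are the rows where the elapsed time between two points is zero, so a speed cannot be computed for them.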
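So here is my attempt at writing this without a CURSOR. The problem is that you can't use TOP in the recursive part of a common table expression.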
WITH tripdata(originid, endid, startid, speed, distance, startpoint, startt)
AS
(
    SELECT originid, endid, startid, speed, distance, startpoint, startt
    FROM
    (
        SELECT
            origin.id as originid
            , NULL as endid
            , origin.id as startid
            , NULL as distance
            , NULL as speed
            , point as startpoint
            , t as startt
        FROM locationdata origin
    ) a
    UNION ALL
    SELECT
        originid as originid
        , startid as endid
        , l.id as startid
        , origin.startpoint.STDistance(l.point) as distance
        , (origin.startpoint.STDistance(l.point)/(datediff(S, origin.startt, l.t))) * -2.23694 as speed
        , l.point as startpoint
        , l.t as startt
    FROM tripdata origin
    CROSS APPLY
    (
        SELECT top 1
            z.id
            ,z.point
            ,z.t
        FROM locationdata z
        where origin.startid > z.ID and origin.startid -5 < z.ID
            and z.t <> origin.startt
            and (origin.startpoint.STDistance(z.point)/(datediff(S, origin.startt, z.t))) * -2.23694 < 150
        order by origin.startpoint.STDistance(z.point)
    ) l
)
SELECT *
FROM tripdata
WHERE originid = 218255
;
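One direction I have seen suggested for the TOP restriction is to rank the candidates with `ROW_NUMBER()` inside a derived table and keep only rank 1. Below is an untested sketch of that idea, not a confirmed fix. It makes a few assumptions of my own: the starting row is filtered in the anchor rather than in the outer query, the NULL columns are cast so the anchor and recursive column types match, `NULLIF` guards against a zero-second gap, and `MAXRECURSION 0` lifts the default limit of 100 levels.

```sql
WITH tripdata (originid, endid, startid, speed, distance, startpoint, startt) AS
(
    -- Anchor: a single starting row; NULLs are cast so the column types
    -- match the recursive member exactly.
    SELECT o.ID,
           CAST(NULL AS bigint),
           o.ID,
           CAST(NULL AS float),
           CAST(NULL AS float),
           o.point,
           o.t
    FROM dbo.locationdata AS o
    WHERE o.ID = 218255

    UNION ALL

    -- Recursive member: rank candidates by distance and keep rank 1,
    -- standing in for TOP 1 ... ORDER BY.
    SELECT c.originid, c.endid, c.startid, c.speed, c.distance,
           c.startpoint, c.startt
    FROM
    (
        SELECT r.originid,
               r.startid AS endid,
               z.ID      AS startid,
               r.startpoint.STDistance(z.point)
                   / NULLIF(DATEDIFF(SECOND, z.t, r.startt), 0) * 2.23694 AS speed,
               r.startpoint.STDistance(z.point) AS distance,
               z.point AS startpoint,
               z.t     AS startt,
               ROW_NUMBER() OVER (ORDER BY r.startpoint.STDistance(z.point)) AS rn
        FROM tripdata AS r
        JOIN dbo.locationdata AS z
          ON z.ID < r.startid
         AND z.ID > r.startid - 5
        -- A zero-second gap yields NULL here, so those rows are filtered out.
        WHERE r.startpoint.STDistance(z.point)
                  / NULLIF(DATEDIFF(SECOND, z.t, r.startt), 0) * 2.23694 < 150
    ) AS c
    WHERE c.rn = 1
)
SELECT *
FROM tripdata
OPTION (MAXRECURSION 0);
```

Because each step moves to a strictly smaller ID, the recursion should stop once no earlier point within the last 5 rows passes the speed check. I have not confirmed how this behaves on the full dataset.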
I'm open to suggestions on how to fix this query, or on whether this is even possible.