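I have a table whose structure I have simplified to the small table below.
I want to transform this dataset into the following form:
The new dataset should contain a single record for each DC case, with a yes/no flag indicating whether NatureOfTumour changed from DC to IN and, where applicable, the time it took to change from DC to IN.
A change from DC to IN counts only if the location stays the same; that is, only records where NatureOfTumour changes from DC to IN and Location is unchanged should be considered. ItemNo is the unique ID.
At the suggestion of a community member, I have also pasted the table as text below, cleaned up as much as possible. The last column, "Gen", is empty. Copying the text below into Excel and running Text to Columns (space-delimited) should give the original table in a readable format. Sorry, I could not think of a better way to paste the table.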
ItemNo DateOfTest NatureOfTumour Location Centre Gen
2345 07/2006 DC P S-224
2345 12/2006 IN P S-224
2342 05/2004 DC Q B-266
3878 06/2006 DC P S-224
3878 05/2005 DC Q S-224
5678 09/2000 IN P S-224
5597 10/2001 DC P B-266
5597 01/1999 IN Q B-266
Answer 0 (score: 1)
Try this. The LEAD function looks at the next row within each ItemNo group, ordered by DateOfTest.
WITH abc AS (
    -- For each row, fetch the next test (by DateOfTest) of the same ItemNo
    SELECT
        ItemNo
        ,DateOfTest
        ,NatureOfTumour
        ,Location
        ,Centre
        ,LEAD(NatureOfTumour) OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureNature
        ,LEAD(Location) OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureLocation
        ,LEAD(DateOfTest) OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureDateOfTest
    FROM test_results
)
SELECT
    ItemNo
    ,DateOfTest
    ,NatureOfTumour
    ,CASE
        WHEN FutureNature = 'IN'
            AND FutureLocation = Location
        THEN 'Yes'
        ELSE 'No'
    END AS State_Change
    -- time to the next test; meaningful only when State_Change = 'Yes'
    ,FutureDateOfTest - DateOfTest AS Date_Diff
    ,Location
    ,Centre
FROM abc
-- keep one output row per DC record
WHERE NatureOfTumour = 'DC'
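To check the LEAD approach against the sample rows from the question, here is a minimal sketch that runs an equivalent query in an in-memory SQLite database from Python. Several things here are assumptions and not part of the answer above: the use of SQLite (window functions need version 3.25 or later), storing the MM/YYYY dates as ISO text with the day set to 01 so that they sort correctly, `julianday` arithmetic in place of direct date subtraction, and the illustrative column name `Days_To_Change`.

```python
import sqlite3

# Sample rows from the question; MM/YYYY rewritten as YYYY-MM-01 (assumption)
rows = [
    (2345, "2006-07-01", "DC", "P", "S-224"),
    (2345, "2006-12-01", "IN", "P", "S-224"),
    (2342, "2004-05-01", "DC", "Q", "B-266"),
    (3878, "2006-06-01", "DC", "P", "S-224"),
    (3878, "2005-05-01", "DC", "Q", "S-224"),
    (5678, "2000-09-01", "IN", "P", "S-224"),
    (5597, "2001-10-01", "DC", "P", "B-266"),
    (5597, "1999-01-01", "IN", "Q", "B-266"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_results ("
    "ItemNo INTEGER, DateOfTest TEXT, NatureOfTumour TEXT, "
    "Location TEXT, Centre TEXT)"
)
conn.executemany("INSERT INTO test_results VALUES (?, ?, ?, ?, ?)", rows)

query = """
WITH abc AS (
    SELECT ItemNo, DateOfTest, NatureOfTumour, Location,
           LEAD(NatureOfTumour) OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureNature,
           LEAD(Location)       OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureLocation,
           LEAD(DateOfTest)     OVER (PARTITION BY ItemNo ORDER BY DateOfTest) AS FutureDateOfTest
    FROM test_results
)
SELECT ItemNo, DateOfTest, Location,
       CASE WHEN FutureNature = 'IN' AND FutureLocation = Location
            THEN 'Yes' ELSE 'No' END AS State_Change,
       -- day difference only when the DC -> IN change actually happened
       CASE WHEN FutureNature = 'IN' AND FutureLocation = Location
            THEN CAST(julianday(FutureDateOfTest) - julianday(DateOfTest) AS INTEGER)
       END AS Days_To_Change
FROM abc
WHERE NatureOfTumour = 'DC'
ORDER BY ItemNo, DateOfTest
"""

lead_result = conn.execute(query).fetchall()
for row in lead_result:
    print(row)
```

On this sample only ItemNo 2345 is flagged 'Yes', 153 days after its DC test; the other four DC rows come back as 'No' with no day count. Note that LEAD only inspects the immediately following test for the same ItemNo.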
Answer 1 (score: 0)
You need a self-join. Something along these lines:
SELECT
    d.ItemNo,
    i.DateOfTest - d.DateOfTest AS datediff,
    d.Location,
    d.Centre,
    d.Gen
FROM
    (
        -- DC records
        SELECT
            *
        FROM demo
        WHERE NatureOfTumour = 'DC'
    ) d
INNER JOIN
    (
        -- IN records
        SELECT
            *
        FROM demo
        WHERE NatureOfTumour = 'IN'
    ) i ON d.ItemNo = i.ItemNo
    AND d.Location = i.Location
    -- count only IN records that come after the DC record
    AND i.DateOfTest > d.DateOfTest;
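A quick way to try the self-join idea on the question's sample rows is sketched below. It rests on assumptions that are not in the answer: Python's `sqlite3` with an in-memory table named `demo`, dates stored as ISO text with the day fixed at 01, `julianday` for the difference, and a condition that the IN record must be later than the DC record, which the question implies.

```python
import sqlite3

# Sample rows from the question; MM/YYYY rewritten as YYYY-MM-01 (assumption)
rows = [
    (2345, "2006-07-01", "DC", "P", "S-224"),
    (2345, "2006-12-01", "IN", "P", "S-224"),
    (2342, "2004-05-01", "DC", "Q", "B-266"),
    (3878, "2006-06-01", "DC", "P", "S-224"),
    (3878, "2005-05-01", "DC", "Q", "S-224"),
    (5678, "2000-09-01", "IN", "P", "S-224"),
    (5597, "2001-10-01", "DC", "P", "B-266"),
    (5597, "1999-01-01", "IN", "Q", "B-266"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo ("
    "ItemNo INTEGER, DateOfTest TEXT, NatureOfTumour TEXT, "
    "Location TEXT, Centre TEXT)"
)
conn.executemany("INSERT INTO demo VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT d.ItemNo,
       CAST(julianday(i.DateOfTest) - julianday(d.DateOfTest) AS INTEGER) AS datediff,
       d.Location,
       d.Centre
FROM demo d
INNER JOIN demo i
        ON d.ItemNo = i.ItemNo
       AND d.Location = i.Location
       AND i.DateOfTest > d.DateOfTest   -- IN must follow DC
WHERE d.NatureOfTumour = 'DC'
  AND i.NatureOfTumour = 'IN'
"""

join_result = conn.execute(query).fetchall()
print(join_result)
```

Because this is an inner join, DC cases that never changed are dropped from the result; keeping them with a 'No' flag would need a LEFT JOIN instead.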
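Answer 2 (score: 0)
If I understand your question correctly, you can try this; let me know if it works. If you only want the rows that changed (GEN_NEW = 'Y'), change the LEFT JOIN to an INNER JOIN.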
SELECT A.ITEMNO, A.DATEOFTEST, A.NATUREOFTUMOUR, A.LOCATION
, CASE WHEN B.NATUREOFTUMOUR='IN' AND A.LOCATION = B.LOCATION THEN 'Y' ELSE 'N' END AS GEN_NEW
, CASE WHEN B.NATUREOFTUMOUR='IN' AND A.LOCATION = B.LOCATION THEN B.DATEOFTEST-A.DATEOFTEST END AS TIME_PASS
FROM TE A
LEFT JOIN TE B ON A.ITEMNO=B.ITEMNO AND B.NATUREOFTUMOUR<>'DC' AND A.DATEOFTEST < B.DATEOFTEST
WHERE A.NATUREOFTUMOUR='DC'
Or, since I could not tell from your question which behaviour you want:
SELECT A.ITEMNO, A.DATEOFTEST, A.NATUREOFTUMOUR, A.LOCATION
, CASE WHEN B.NATUREOFTUMOUR='IN' THEN 'Y' ELSE 'N' END AS GEN_NEW
, CASE WHEN B.NATUREOFTUMOUR='IN' THEN B.DATEOFTEST-A.DATEOFTEST END AS TIME_PASS
FROM TE A
LEFT JOIN TE B ON A.ITEMNO=B.ITEMNO AND B.NATUREOFTUMOUR<>'DC' AND A.DATEOFTEST < B.DATEOFTEST AND A.LOCATION = B.LOCATION
WHERE A.NATUREOFTUMOUR='DC'
Output
ITEMNO DATEOFTEST NATUREOFTUMOUR LOCATION GEN_NEW TIME_PASS
1 2345 01.07.2006 DC P Y 153
2 2342 01.06.2006 DC Q N NULL
3 5597 01.10.2001 DC P N NULL
4 3878 01.05.2005 DC Q N NULL
5 3878 01.06.2006 DC P N NULL
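If a DC record can be followed by more than one IN record at the same location, a LEFT JOIN returns one row per match. One possible way to keep a single record per DC case, as the question asks, is to group by the DC row and take the earliest later IN date. The sketch below is not part of the answer above and rests on several assumptions: Python's `sqlite3` with an in-memory table named `TE`, the question's sample rows with dates stored as ISO text (day fixed at 01), and `julianday` for the day difference.

```python
import sqlite3

# Sample rows from the question; MM/YYYY rewritten as YYYY-MM-01 (assumption)
rows = [
    (2345, "2006-07-01", "DC", "P", "S-224"),
    (2345, "2006-12-01", "IN", "P", "S-224"),
    (2342, "2004-05-01", "DC", "Q", "B-266"),
    (3878, "2006-06-01", "DC", "P", "S-224"),
    (3878, "2005-05-01", "DC", "Q", "S-224"),
    (5678, "2000-09-01", "IN", "P", "S-224"),
    (5597, "2001-10-01", "DC", "P", "B-266"),
    (5597, "1999-01-01", "IN", "Q", "B-266"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TE ("
    "ItemNo INTEGER, DateOfTest TEXT, NatureOfTumour TEXT, "
    "Location TEXT, Centre TEXT)"
)
conn.executemany("INSERT INTO TE VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT a.ItemNo, a.DateOfTest, a.Location,
       -- MIN() is NULL when no later IN record at the same location exists
       CASE WHEN MIN(b.DateOfTest) IS NOT NULL THEN 'Y' ELSE 'N' END AS GEN_NEW,
       CAST(julianday(MIN(b.DateOfTest)) - julianday(a.DateOfTest) AS INTEGER) AS TIME_PASS
FROM TE a
LEFT JOIN TE b
       ON a.ItemNo = b.ItemNo
      AND b.NatureOfTumour = 'IN'
      AND a.DateOfTest < b.DateOfTest
      AND a.Location = b.Location
WHERE a.NatureOfTumour = 'DC'
GROUP BY a.ItemNo, a.DateOfTest, a.Location
ORDER BY a.ItemNo, a.DateOfTest
"""

grouped_result = conn.execute(query).fetchall()
for row in grouped_result:
    print(row)
```

Unlike a LEAD-based query, this looks at every later IN record for the same ItemNo and Location, not just the next test, so the two approaches can disagree when other tests fall in between.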