I have the following query:
SELECT
START_DATE
,ID
,USER
,ROW_NUMBER() OVER (PARTITION BY USER ORDER BY START_DATE) AS RN
FROM TABLE
which returns the following result:
START_DATE ID USER RN
2019-01-01 200 01 1
2019-01-10 450 01 2
2019-01-02 500 02 1
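I only want to show users that have more than one start date (rows 1, 2, 3) and exclude users whose only row_number is 1. Secondly, I need to show the DATEDIFF() between each Row_Number.
I thought about using ROW_NUMBER for this, but I don't know how to do it. It may need a different solution altogether. The desired output looks like this: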
START_DATE USER datediff
2019-01-01 01 10
Answer 0: (score: 2)
WITH TMP AS (
SELECT
START_DATE
,ID
,USER
,LAG(START_DATE) OVER ( PARTITION BY USER ORDER BY START_DATE) AS LAST_START_DATE
,CASE WHEN MIN(START_DATE ) OVER ( PARTITION BY USER ) =
MAX(START_DATE) OVER ( PARTITION BY USER) THEN 1
ELSE 0
END ExcludeIfOnlyNonUnique
FROM TEST_DATA)
SELECT TMP.START_DATE,
TMP.ID,
TMP.USER,
DATEDIFF(TMP.START_DATE, TMP.LAST_START_DATE) START_DATE_DIFF
FROM TMP
WHERE ExcludeIfOnlyNonUnique = 0;
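Answer 1: (score: 1)
There are many ways to do this, but one of the more elegant ones is to use the Microsoft SQL Server functions LEAD and LAG. LAG gives you access to a value from the previous row in the partition, and LEAD to a value from the next row. Read about them here:
LEAD: https://docs.microsoft.com/en-us/sql/t-sql/functions/lead-transact-sql?view=sql-server-2017
LAG: https://docs.microsoft.com/en-us/sql/t-sql/functions/lag-transact-sql?view=sql-server-2017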
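As a rough illustration of the LAG idea, here is a minimal sketch that runs the sample data from the question through SQLite from Python. It is not SQL Server code: the table name `starts` and the column `user_id` are made-up placeholders, and `julianday()` stands in for DATEDIFF because SQLite has no DATEDIFF function. Window functions require SQLite 3.25 or later.

```python
import sqlite3

# In-memory SQLite database with the three sample rows from the question.
# `starts` and `user_id` are placeholder names, not from the original post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE starts (start_date TEXT, id INTEGER, user_id TEXT)")
conn.executemany(
    "INSERT INTO starts VALUES (?, ?, ?)",
    [("2019-01-01", 200, "01"), ("2019-01-10", 450, "01"), ("2019-01-02", 500, "02")],
)

# LAG fetches the previous start date within each user's partition.
# Rows with no previous date (the first row per user) are filtered out,
# which also drops users that have only a single start date.
query = """
SELECT start_date, user_id, prev_start_date,
       CAST(julianday(start_date) - julianday(prev_start_date) AS INTEGER) AS days_since_prev
FROM (
    SELECT start_date, user_id,
           LAG(start_date) OVER (PARTITION BY user_id ORDER BY start_date) AS prev_start_date
    FROM starts
)
WHERE prev_start_date IS NOT NULL
ORDER BY user_id, start_date
"""
lag_rows = conn.execute(query).fetchall()
for row in lag_rows:
    print(row)
```

With the sample data, only user 01 has a previous start date, so a single row comes back with a gap of 9 days. The question's expected value of 10 would need a + 1 if both end dates are meant to be counted.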
Answer 2: (score: 0)
I would use a CTE and join the query twice.
WITH MY_QUERY AS (
SELECT
START_DATE,
ID,
USER,
ROW_NUMBER() OVER (PARTITION BY USER ORDER BY START_DATE) AS RN
FROM SOMETABLE
), MY_ID AS (
SELECT DISTINCT USER FROM MY_QUERY
)
SELECT Q1.START_DATE, Q1.USER, DATEDIFF(Q1.START_DATE, Q2.START_DATE) AS DIFF
FROM MY_ID ID
JOIN MY_QUERY Q1 ON ID.USER = Q1.USER AND Q1.RN = 1
JOIN MY_QUERY Q2 ON ID.USER = Q2.USER AND Q2.RN = 2
Answer 3: (score: 0)
Try the script below:
WITH CTE(START_DATE, ID, [USER], RN)
AS
(
SELECT
START_DATE
,ID
,[USER]
,ROW_NUMBER() OVER (PARTITION BY [USER] ORDER BY START_DATE) AS RN
FROM [TABLE] -- table name taken from the question; replace with your own
)
SELECT MIN(START_DATE) [START_DATE],
[USER],
DATEDIFF(DD,MIN(START_DATE),MAX(START_DATE))+1 [datediff]
FROM CTE
GROUP BY [USER]
HAVING MAX(RN) > 1
Answer 4: (score: 0)
You don't want row_number(). You want count():
select t.*
from (select . . . ,
count(*) over (partition by user) as cnt
from t
) t
where cnt > 1;
If you still need it, you can add row_number() to the result set.
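To make the windowed count concrete, the following is a minimal sketch using SQLite from Python with the sample rows from the question. The table name `starts` and the column `user_id` are hypothetical placeholders, and the dialect is SQLite (3.25 or later) rather than whatever engine the asker is using. It keeps row_number() next to the count to show that the two can coexist in the result set.

```python
import sqlite3

# In-memory SQLite database with the three sample rows from the question.
# `starts` and `user_id` are placeholder names, not from the original post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE starts (start_date TEXT, id INTEGER, user_id TEXT)")
conn.executemany(
    "INSERT INTO starts VALUES (?, ?, ?)",
    [("2019-01-01", 200, "01"), ("2019-01-10", 450, "01"), ("2019-01-02", 500, "02")],
)

# COUNT(*) OVER attaches the per-user row count to every row, so the outer
# query can keep only users that have more than one start date.
query = """
SELECT start_date, id, user_id, rn, cnt
FROM (
    SELECT start_date, id, user_id,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY start_date) AS rn,
           COUNT(*) OVER (PARTITION BY user_id) AS cnt
    FROM starts
)
WHERE cnt > 1
ORDER BY user_id, rn
"""
count_rows = conn.execute(query).fetchall()
for row in count_rows:
    print(row)
```

On the sample data this returns both rows for user 01 and nothing for user 02, who has a single start date.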