Question

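I'm working in Snowflake and trying to flag the first occurrence of each unique ID in a column. I've been playing around with first_value but haven't really gotten anywhere.

So my data looks like this:

Ideally, I'd like something like this: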

ID   Date    First?
123  1/2019  1
123  2/2019  0
123  3/2019  0
234  2/2019  1
234  3/2019  0

How can I do this?

Answer 1

You want ROW_NUMBER:

SELECT 
   ID, 
   Date, 
   IFF(ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date) = 1, 1, 0) AS First
FROM 
   schema.table
ORDER BY ID, Date
;

This checks whether the current row has the earliest date for its ID and, if so, assigns it 1 (otherwise 0).
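
To try the query without an existing table, here is a minimal sketch that inlines the sample rows from the question. The names sample_rows, id, month_key, and is_first are placeholders, not anything from the asker's schema, and the months are written as 'YYYY-MM' strings so that they sort chronologically; a real DATE column would behave the same way.

```sql
-- Hypothetical inline data mirroring the example in the question.
WITH sample_rows AS (
    SELECT column1 AS id, column2 AS month_key
    FROM (VALUES
        (123, '2019-01'),
        (123, '2019-02'),
        (123, '2019-03'),
        (234, '2019-02'),
        (234, '2019-03')
    )
)
SELECT
    id,
    month_key,
    -- 1 on the earliest row of each id, 0 on every later row
    IFF(ROW_NUMBER() OVER (PARTITION BY id ORDER BY month_key) = 1, 1, 0) AS is_first
FROM sample_rows
ORDER BY id, month_key;
```

With these rows, is_first is 1 for (123, 2019-01) and (234, 2019-02) and 0 for the rest, matching the table in the question. If the column actually holds text such as 1/2019, ordering by it is lexical (10/2019 sorts before 2/2019), so it would need converting to a date first.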

Answer 2

If you want to retrieve the first occurrence of each unique ID in the column, the row_number() or dense_rank() function can help.

-- Keep exactly one row per ID: the earliest by date.
with cte as
(
select ID, Date,
row_number() over (partition by ID order by Date) as rn
from table1
)
select * from cte where rn = 1;

-- Same idea with dense_rank(); rows tied on the earliest date are all kept.
with cte as
(
select ID, Date,
dense_rank() over (partition by ID order by Date) as rnk
from table1
)
select * from cte where rnk = 1;
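
The two functions only differ when an ID has more than one row on its earliest date. The sketch below puts them side by side on made-up data; sample_rows, month_key, and note are hypothetical names used purely for illustration.

```sql
-- Hypothetical data: id 123 has two rows tied on its earliest month.
WITH sample_rows AS (
    SELECT column1 AS id, column2 AS month_key, column3 AS note
    FROM (VALUES
        (123, '2019-01', 'a'),
        (123, '2019-01', 'b'),
        (123, '2019-02', 'c')
    )
)
SELECT
    id,
    month_key,
    note,
    -- Numbers the rows 1, 2, 3; which tied row gets 1 is not guaranteed.
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY month_key) AS rn,
    -- Gives both tied rows 1 and the 2019-02 row 2.
    DENSE_RANK() OVER (PARTITION BY id ORDER BY month_key) AS rnk
FROM sample_rows;
```

So filtering on the row_number() result returns one row per ID, while filtering on the dense_rank() result can return several. If exactly one flagged row per ID is required and ties are possible, adding a tie-breaking column to the ORDER BY makes the row_number() choice deterministic.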

Answer 3

LAG can also be used to solve this (it can be done with FIRST_VALUE as well):

SELECT id
    ,date
    ,lag(id) over (partition by id order by date) is null as first
FROM table_name;
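
The answer mentions FIRST_VALUE without showing it. One way it might look is sketched below, again on hypothetical inline data (sample_rows, month_key, and is_first are illustrative names): a row is flagged when its own date equals the earliest date in its ID's partition.

```sql
-- Hypothetical inline data mirroring the example in the question.
WITH sample_rows AS (
    SELECT column1 AS id, column2 AS month_key
    FROM (VALUES
        (123, '2019-01'),
        (123, '2019-02'),
        (123, '2019-03'),
        (234, '2019-02'),
        (234, '2019-03')
    )
)
SELECT
    id,
    month_key,
    -- Compare each row's date with the first date seen for its id.
    IFF(month_key = FIRST_VALUE(month_key) OVER (PARTITION BY id ORDER BY month_key), 1, 0) AS is_first
FROM sample_rows
ORDER BY id, month_key;
```

Two caveats apply. This form flags every row tied on the earliest date, whereas the LAG query flags exactly one row per ID. The LAG query also returns a boolean rather than 1/0 (wrapping it in IFF(..., 1, 0) would convert it), and it relies on id never being NULL, since a NULL id would make every row in that partition look like the first.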