Snowflake: UPDATE with a correlated subquery that uses TIMEDIFF

Asked: 2019-02-25 14:41:44

Tags: sql sql-update correlated-subquery snowflake-datawarehouse snowflake

I am running this query on a Snowflake database:

UPDATE "click" c
SET "Registration_score" =
(SELECT COUNT(*) FROM "trackingpoint" t
WHERE 1=1
AND c."CookieID" = t."CookieID"
AND t."page" ilike '%Registration complete'
AND TIMEDIFF(minute,c."Timestamp",t."Timestamp") < 4320
AND TIMEDIFF(second,c."Timestamp",t."Timestamp") > 0);

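The database returns "Unsupported subquery type cannot be evaluated". However, if I run it without the last two conditions (the ones that use TIMEDIFF), it works fine. I confirmed that TIMEDIFF itself works in queries like these: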

select count(*) from "trackingpoint"
where TIMEDIFF(minute, '2018-01-01', "Timestamp") > 604233;
select count(*) from "click"
where TIMEDIFF(minute, '2018-01-01', "Timestamp") > 604233;

Both of these run fine. I can't see why the TIMEDIFF conditions should prevent the database from returning a result. Any idea what to change to make this work?

1 answer:

Answer 0 (score: 1)

So, using the following setup:

create table click (id number, 
   timestamp timestamp_ntz,
   cookieid number,
   Registration_score number);
create table trackingpoint(id number, 
   timestamp timestamp_ntz, 
   cookieid number, 
   page text );


insert into click values (1,'2018-03-20', 101, 0),
    (2,'2019-03-20', 102, 0);
insert into trackingpoint values (1,'2018-03-20 00:00:10', 101, 'user reg comp'),
    (2,'2018-03-20 00:00:11', 102, 'user reg comp'),
    (3,'2018-03-20 00:00:13', 102, 'pet reg comp'),
    (4,'2018-03-20 00:00:15', 102, 'happy dance');

You can see that the join returns the rows we expect:

select c.*, t.*
from click c
join trackingpoint t 
    on c.cookieid = t.cookieid ;

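There are now two ways to get the count. The first puts every rule into the join condition, so the count is simply the number of joined rows. This is a good approach if you are only counting one thing, because all the rules act as join filters: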

select c.id,
  count(1) as new_score
from click c
join trackingpoint t 
    on c.cookieid = t.cookieid
    and t.page ilike '%reg comp'
    and TIMEDIFF(minute, c.timestamp, t.timestamp) < 4320
group by 1;

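Alternatively (in Snowflake syntax), you can move the conditions to the aggregate/select side. This lets you compute more than one answer in a single pass if that is what you need (that is the situation I usually find myself in, which is why I mention it):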

select c.id,
    sum(iff(t.page ilike '%reg comp' AND TIMEDIFF(minute, c.timestamp, t.timestamp) < 4320, 1, 0)) as new_score
from click c
join trackingpoint t 
    on c.cookieid = t.cookieid
group by 1;
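To illustrate what "more than one answer" can look like, here is a minimal sketch that computes two scores in one pass over the same join. The second pattern ('%user reg comp') and the output column names any_reg_score and user_reg_score are assumptions made up for this example; they are not part of the original question.

```sql
-- Sketch only: two conditional counts from a single join.
-- The column aliases and the '%user reg comp' pattern are illustrative.
select c.id,
    -- any page ending in 'reg comp' within 3 days (4320 minutes)
    sum(iff(t.page ilike '%reg comp'
            and TIMEDIFF(minute, c.timestamp, t.timestamp) < 4320, 1, 0)) as any_reg_score,
    -- only pages ending in 'user reg comp', same time window
    sum(iff(t.page ilike '%user reg comp'
            and TIMEDIFF(minute, c.timestamp, t.timestamp) < 4320, 1, 0)) as user_reg_score
from click c
join trackingpoint t
    on c.cookieid = t.cookieid
group by c.id;
```

Because the filters live in the select list rather than in the join, each aggregate can apply its own rule while the join itself stays a plain equality on cookieid.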

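So, plugging this into the UPDATE pattern (see the last example in the documentation): https://docs.snowflake.net/manuals/sql-reference/sql/update.html

You can move to a single sub-select in the FROM clause instead of the correlated subquery, which Snowflake does not support in this form; that is the error message you are getting.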

UPDATE click c
SET Registration_score = s.new_score
from (
    select ic.id,
        count(*) as new_score
    from click ic
    join trackingpoint it 
        on ic.cookieid = it.cookieid
        and it.page ilike '%reg comp'
        and TIMEDIFF(minute, ic.timestamp, it.timestamp) < 4320
    group by 1) as s
WHERE c.id = s.id; 
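Two details of the UPDATE above differ from the query in the question, and the following sketch shows one way to close the gap. First, it leaves out the question's lower bound (TIMEDIFF(second, ...) > 0), so tracking points that happened before the click are also counted. Second, the inner join produces no row for a click without matches, so that click keeps its old score instead of being set to 0 as the original correlated COUNT(*) would have done. The variant below is an assumption about what the asker wanted, not part of the original answer; it uses a LEFT JOIN and counts a column from the right-hand table.

```sql
-- Sketch only: keeps both time bounds from the question and
-- writes 0 for clicks that have no matching tracking point.
UPDATE click c
SET Registration_score = s.new_score
from (
    select ic.id,
        -- count(it.id) ignores the NULLs produced by unmatched LEFT JOIN rows
        count(it.id) as new_score
    from click ic
    left join trackingpoint it
        on ic.cookieid = it.cookieid
        and it.page ilike '%reg comp'
        and TIMEDIFF(minute, ic.timestamp, it.timestamp) < 4320
        and TIMEDIFF(second, ic.timestamp, it.timestamp) > 0
    group by ic.id) as s
WHERE c.id = s.id;
```

With the sample data above, click 2 is dated a year after its tracking points, so the lower bound should exclude all of them and leave its score at 0, whereas the version without the lower bound would count them.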

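As for why adding TIMEDIFF breaks the query: as you observed, the subquery works while it is tied to the outer row only through the CookieID equality. The TIMEDIFF conditions also tie it to each UPDATE row's Timestamp through an inequality, and that is the kind of correlated subquery Snowflake reports it cannot evaluate. The workaround is to build one "big but simple" subquery and join to it.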