How to use rank functions in PostgreSQL to associate an online order with the preceding website visits

Time: 2019-04-12 21:10:12

Tags: sql postgresql ahoy

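I'm creating a database view that associates online orders with the website visits the same user made before them. This is for an e-commerce site, so a single user can visit and order several times.

I've joined the visits table and the orders table on user_id, and matched each order to the most recent visit that started before the order time. Now I'd like every visit up to order 1 to be numbered "1", then every later visit up to order 2 to be numbered "2", and so on. Also, if that particular user has no order_id at all, I'd like to return "0". Please see the screenshots linked below for reference.

I've tried dense_rank, but it only ranks the rows where an order_id exists. I'd like to extend those ranks to the visits that have no order_id.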

SELECT v.id AS visit_id,
    v.user_id,
    v.started_at AS visit_date,
    dense_rank() OVER (PARTITION BY v.user_id ORDER BY v.started_at) AS visit_number,
    dense_rank() OVER (PARTITION BY v.user_id ORDER BY o.id) AS order_number,
    o.id AS order_id,
    o.created_at AS order_date
   FROM visits v
     FULL JOIN orders o
       ON v.user_id = o.user_id
      AND v.started_at < o.created_at
      AND o.created_at < (SELECT min(visits.started_at) AS min
                          FROM visits
                          WHERE visits.user_id = v.user_id
                            AND visits.started_at > v.started_at)
      AND (v.started_at + '24:00:00'::interval) > o.created_at
  GROUP BY v.id, v.user_id, v.started_at, o.id, o.created_at
  ORDER BY v.started_at;

(Screenshots: Current results, Expected results)

2 answers:

Answer 0 (score: 0)

Use lag() to check whether the previous row's order_id is non-null, so that the current row can be flagged as the start of a new group. Once the flag is set, you can use a running sum of it to define the groups.
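
The lag-and-running-sum idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the `joined` CTE and its sample rows are hypothetical stand-ins for the result of the question's join (one row per visit, with order_id NULL when the visit produced no order), and numbering the groups from 1 is one reading of the expected output.

```sql
-- Hypothetical sample data standing in for the question's joined rows.
WITH joined (visit_id, user_id, started_at, order_id) AS (
    VALUES
        (1, 10, TIMESTAMP '2019-04-01 09:00', NULL::int),
        (2, 10, TIMESTAMP '2019-04-02 09:00', 501),
        (3, 10, TIMESTAMP '2019-04-03 09:00', NULL),
        (4, 10, TIMESTAMP '2019-04-04 09:00', 502),
        (5, 20, TIMESTAMP '2019-04-01 10:00', NULL)
),
flagged AS (
    SELECT j.*,
           -- 1 when the previous visit of this user ended in an order,
           -- i.e. this visit starts the run leading up to the next order
           CASE WHEN lag(order_id) OVER w IS NOT NULL THEN 1 ELSE 0 END AS new_group
    FROM joined j
    WINDOW w AS (PARTITION BY user_id ORDER BY started_at)
)
SELECT visit_id, user_id, started_at, order_id,
       CASE
           -- users without any order get 0, as the question asks
           WHEN count(order_id) OVER (PARTITION BY user_id) = 0 THEN 0
           -- otherwise the running sum of the flags numbers the groups
           ELSE 1 + sum(new_group) OVER (PARTITION BY user_id ORDER BY started_at)
       END AS order_number
FROM flagged
ORDER BY user_id, started_at;
```

On this sample, visits 1 and 2 fall into group 1, visits 3 and 4 into group 2, and the visit of user 20 gets 0. Visits after a user's last order would receive the next group number, which may or may not be what you want.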

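Answer 1 (score: 0)

The GROUP BY seems unnecessary, but I'll leave it in. You basically want a cumulative sum.

I would assign the order number to all visits before a particular order: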

SELECT v.id AS visit_id, v.user_id,
       v.started_at AS visit_date,
       dense_rank() OVER (PARTITION BY v.user_id ORDER BY v.started_at) AS visit_number,
       o.id AS order_id,
       o.created_at AS order_date,
       -- running count of orders up to and including this visit;
       -- replaces the question's dense_rank() order_number, because a view
       -- cannot have two columns with the same name
       count(o.id) OVER (PARTITION BY v.user_id ORDER BY v.started_at) AS order_number
FROM visits v FULL JOIN
     orders o
     ON v.user_id = o.user_id AND
        v.started_at < o.created_at AND
        -- the order must precede the same user's next visit
        o.created_at < (SELECT min(v2.started_at)
                        FROM visits v2
                        WHERE v2.user_id = v.user_id AND
                              v2.started_at > v.started_at) AND
        (v.started_at + '24:00:00'::interval) > o.created_at
GROUP BY v.id, v.user_id, v.started_at, o.id, o.created_at
ORDER BY v.started_at;
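
To see what the cumulative count does in isolation, here is a small sketch on hypothetical data. The `joined` CTE and its rows are assumptions standing in for the joined visits and orders; they are not taken from the question.

```sql
-- Hypothetical sample data: one row per visit, order_id NULL when no order followed.
WITH joined (visit_id, user_id, started_at, order_id) AS (
    VALUES
        (1, 10, TIMESTAMP '2019-04-01 09:00', NULL::int),
        (2, 10, TIMESTAMP '2019-04-02 09:00', 501),
        (3, 10, TIMESTAMP '2019-04-03 09:00', NULL),
        (4, 10, TIMESTAMP '2019-04-04 09:00', 502),
        (5, 20, TIMESTAMP '2019-04-01 10:00', NULL)
)
SELECT visit_id, user_id, started_at, order_id,
       -- count() ignores NULLs, so this is the number of orders
       -- placed up to and including the current visit
       count(order_id) OVER (PARTITION BY user_id ORDER BY started_at) AS order_number
FROM joined
ORDER BY user_id, started_at;
```

For user 10 this yields 0, 1, 1, 2: the count carries each order number forward to later visits, so a visit before the first order gets 0 rather than 1.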

I think this is the logic you want:

SELECT v.id AS visit_id, v.user_id,
       v.started_at AS visit_date,
       dense_rank() OVER (PARTITION BY v.user_id ORDER BY v.started_at) AS visit_number,
       o.id AS order_id,
       o.created_at AS order_date,
       -- walking the visits from latest to earliest, the smallest order number
       -- seen so far is the next order at or after this visit; this replaces
       -- the question's dense_rank() order_number (duplicate name in a view)
       MIN(o.order_number) OVER (PARTITION BY v.user_id ORDER BY v.started_at DESC) AS order_number
FROM visits v FULL JOIN
     (SELECT o.*,
             -- number each user's orders 1, 2, 3, ...
             ROW_NUMBER() OVER (PARTITION BY o.user_id ORDER BY o.id) AS order_number
      FROM orders o
     ) o
     ON v.user_id = o.user_id AND
        v.started_at < o.created_at AND
        -- the order must precede the same user's next visit
        o.created_at < (SELECT min(v2.started_at)
                        FROM visits v2
                        WHERE v2.user_id = v.user_id AND
                              v2.started_at > v.started_at) AND
        (v.started_at + '24:00:00'::interval) > o.created_at
-- o.order_number must be grouped too, since the window function reads it
GROUP BY v.id, v.user_id, v.started_at, o.id, o.created_at, o.order_number
ORDER BY v.started_at;

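However, it may not put 0 and NULL where you want them: MIN() returns NULL for a visit with no later order, so wrap the expression in COALESCE(..., 0) if you need 0 there.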
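
A minimal sketch of this backward-fill logic on hypothetical data, with COALESCE turning a missing order number into 0. The `joined` CTE, its rows, and the precomputed per-user `order_number` column are assumptions made for illustration only.

```sql
-- Hypothetical sample data: order_number is the per-user ROW_NUMBER() of the
-- order attached to the visit, or NULL when the visit produced no order.
WITH joined (visit_id, user_id, started_at, order_number) AS (
    VALUES
        (1, 10, TIMESTAMP '2019-04-01 09:00', NULL::int),
        (2, 10, TIMESTAMP '2019-04-02 09:00', 1),
        (3, 10, TIMESTAMP '2019-04-03 09:00', NULL),
        (4, 10, TIMESTAMP '2019-04-04 09:00', 2),
        (5, 10, TIMESTAMP '2019-04-05 09:00', NULL),
        (6, 20, TIMESTAMP '2019-04-01 10:00', NULL)
)
SELECT visit_id, user_id, started_at,
       COALESCE(
           -- smallest order number among this visit and all later visits
           MIN(order_number) OVER (PARTITION BY user_id ORDER BY started_at DESC),
           0  -- no order at or after this visit
       ) AS order_number
FROM joined
ORDER BY user_id, started_at;
```

Here visits 1 and 2 get 1, visits 3 and 4 get 2, and both visit 5 (after the last order) and the visit of user 20 (no orders at all) get 0.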