Self-joining a table

Time: 2019-03-15 15:51:53

Tags: sql google-bigquery standard-sql

I am trying to join a table to itself to get all session_ids belonging to clientids that have a goal_completion at some (later) point.

Table:

<table class="tableizer-table">
<thead><tr class="tableizer-firstrow"><th>clientid</th><th>sessionid</th><th>goalcompletion</th></tr></thead><tbody>
 <tr><td>1</td><td>a</td><td>0</td></tr>
 <tr><td>1</td><td>b</td><td>0</td></tr>
 <tr><td>1</td><td>c</td><td>1</td></tr>
 <tr><td>2</td><td>x</td><td>0</td></tr>
 <tr><td>2</td><td>y</td><td>0</td></tr>
 <tr><td>2</td><td>z</td><td>0</td></tr>
</tbody></table>

Expected output:

<table class="tableizer-table">
<thead><tr class="tableizer-firstrow"><th>clientid</th><th>sessionid</th></tr></thead><tbody>
 <tr><td>1</td><td>a</td></tr>
 <tr><td>1</td><td>b</td></tr>
 <tr><td>1</td><td>c</td></tr>
</tbody></table>

I have tried several versions but cannot figure out how this should work. This is my latest iteration:

SELECT a.clientid,
       a.session_id,
       a.goal1completions_funnel,
       a.goal2completions_funnel,
       a.goal3completions_funnel
FROM _demo.ga_conversions_test a
left JOIN _demo.ga_conversions_test b
  ON a.session_id = b.session_id
  AND (b.goal3completions_funnel = 1
     OR b.goal1completions_funnel = 1
     OR b.goal2completions_funnel = 1)

Can you point me in the right direction?

3 answers:

Answer 0: (score: 0)

I don't think you need a join for this task. Try:

select session_id
from _demo.ga_conversions_test 
where goal1completions_funnel=1 OR goal2completions_funnel=1 OR goal3completions_funnel=1
group by session_id

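Edit: Based on the data provided, if I were you, I would find the clientids that have more than one distinct goalcompletion value and use an inner join like this: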

with cte as (
   select clientid, count(distinct goalcompletion) as CountDifferentGoalCompletion
   from _demo.ga_conversions_test
   group by clientid
   having count(distinct goalcompletion) > 1
)
select a.clientid, a.sessionid
from _demo.ga_conversions_test a
inner join cte on cte.clientid = a.clientid

EDIT2: If the CTE does not work in your environment (BigQuery standard SQL supports WITH, but legacy SQL does not), use a derived table instead:

select a.clientid, a.sessionid
from _demo.ga_conversions_test a
inner join (select clientid, count(distinct goalcompletion) as CountDifferentGoalCompletion
            from _demo.ga_conversions_test
            group by clientid
            having count(distinct goalcompletion) > 1) x 
on x.clientid = a.clientid
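
To check the derived-table query against the sample data from the question, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for BigQuery, drops the `_demo.` dataset prefix, and uses the column names from the sample table (`sessionid`, `goalcompletion`) rather than the `goalNcompletions_funnel` columns.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here (an assumption for testing only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ga_conversions_test "
    "(clientid INTEGER, sessionid TEXT, goalcompletion INTEGER)"
)
# Rows copied from the sample table in the question.
conn.executemany(
    "INSERT INTO ga_conversions_test VALUES (?, ?, ?)",
    [(1, "a", 0), (1, "b", 0), (1, "c", 1),
     (2, "x", 0), (2, "y", 0), (2, "z", 0)],
)

# Same shape as the EDIT2 query: keep clients with more than one
# distinct goalcompletion value, then return all of their sessions.
derived_query = """
SELECT a.clientid, a.sessionid
FROM ga_conversions_test a
INNER JOIN (SELECT clientid
            FROM ga_conversions_test
            GROUP BY clientid
            HAVING COUNT(DISTINCT goalcompletion) > 1) x
ON x.clientid = a.clientid
ORDER BY a.clientid, a.sessionid
"""
derived_rows = conn.execute(derived_query).fetchall()
print(derived_rows)
```

Note that `COUNT(DISTINCT goalcompletion) > 1` only matches clients that have both a 0 and a 1. A client whose sessions all have goalcompletion = 1 would be skipped; `HAVING MAX(goalcompletion) = 1` is one way to cover that case.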

Answer 1: (score: 0)

The reason you are not getting the expected result seems to be that you are joining the table on the same session_id. Try this:

Select ....
From _demo.ga_conversions_test a
left join _demo.ga_conversions_test b on a.session_id < b.session_id (and other criteria)

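See whether that works. If it does not, please post a sample of your data structure so we can understand the problem better. If this solves it, please mark my answer as accepted.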
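
As an illustration of joining on something other than session_id, the sketch below self-joins on clientid and keeps only rows whose client has a completed goal. This is one possible reading of the requirement, not necessarily what this answer intended. It assumes SQLite in place of BigQuery and the column names from the sample table.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here (an assumption for testing only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ga_conversions_test "
    "(clientid INTEGER, sessionid TEXT, goalcompletion INTEGER)"
)
# Rows copied from the sample table in the question.
conn.executemany(
    "INSERT INTO ga_conversions_test VALUES (?, ?, ?)",
    [(1, "a", 0), (1, "b", 0), (1, "c", 1),
     (2, "x", 0), (2, "y", 0), (2, "z", 0)],
)

# Self-join on clientid: b supplies the goal-completing session,
# a supplies every session of the same client.
# DISTINCT guards against duplicates when a client has several goal sessions.
self_join_query = """
SELECT DISTINCT a.clientid, a.sessionid
FROM ga_conversions_test a
INNER JOIN ga_conversions_test b
  ON a.clientid = b.clientid
 AND b.goalcompletion = 1
ORDER BY a.clientid, a.sessionid
"""
self_join_rows = conn.execute(self_join_query).fetchall()
print(self_join_rows)
```

The LEFT JOIN in the question keeps every row of `a` whether or not a match exists, which is why clients without a goal are not filtered out; an INNER JOIN drops them. Adding a condition such as `a.sessionid < b.sessionid` would also drop the goal session itself, which the expected output includes.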

Answer 2: (score: 0)

Is this what you want?

select ct.*
from _demo.ga_conversions_test ct
where exists (select 1
              from _demo.ga_conversions_test ct2
              where ct2.clientid = ct.clientid and
                    1 in (ct2.goal1completions_funnel, ct2.goal2completions_funnel, ct2.goal3completions_funnel) and
                    ct2.<timecol> > ct.<ctimecol>
             );
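
The EXISTS pattern can be tried on the sample data with the sketch below. It correlates on clientid, since the question asks for all sessions of a client. The `session_start` column and its values are hypothetical, added only because the question does not show a time column; SQLite stands in for BigQuery.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here (an assumption for testing only).
conn = sqlite3.connect(":memory:")
# session_start is a hypothetical ordering column, not part of the question.
conn.execute(
    "CREATE TABLE ga_conversions_test "
    "(clientid INTEGER, sessionid TEXT, goalcompletion INTEGER, session_start INTEGER)"
)
conn.executemany(
    "INSERT INTO ga_conversions_test VALUES (?, ?, ?, ?)",
    [(1, "a", 0, 1), (1, "b", 0, 2), (1, "c", 1, 3),
     (2, "x", 0, 1), (2, "y", 0, 2), (2, "z", 0, 3)],
)

# {op} is filled with ">" (strictly later goal) or ">=" (goal session included).
exists_template = """
SELECT ct.clientid, ct.sessionid
FROM ga_conversions_test ct
WHERE EXISTS (SELECT 1
              FROM ga_conversions_test ct2
              WHERE ct2.clientid = ct.clientid
                AND ct2.goalcompletion = 1
                AND ct2.session_start {op} ct.session_start)
ORDER BY ct.clientid, ct.sessionid
"""
strict_rows = conn.execute(exists_template.format(op=">")).fetchall()
inclusive_rows = conn.execute(exists_template.format(op=">=")).fetchall()
print(strict_rows)
print(inclusive_rows)
```

With a strict `>` the goal session itself is left out, so only sessions a and b come back for client 1. Using `>=` also returns session c, which matches the expected output in the question.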