Pivot table in Google BigQuery

Time: 2018-07-13 16:41:36

Tags: sql google-bigquery

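I need to count the IDs that are unique to each of A, B, C and D. So "A"/"A", "B"/"B", "C"/"C" and "D"/"D" should give the count of IDs that appear only at A, B, C and D respectively. "A"/"B" and "B"/"A" should be the count of IDs that have both A and B as Place. Similarly, "A"/"C" and "C"/"A" should be the count of IDs that have both A and C as Place, that is, the overlap of IDs between the two places. Every overlapping ID should increase the count by one. Can someone please advise? I have a table like the one below: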

ID     Place
1       A
2       B
1       C
6       B
4       D
5       A
6       C
7       A
8       A
8       C

Could you please guide me toward producing the following output?

   A B C D
A  2 0 2 0
B  0 1 1 0
C  2 1 0 0
D  0 0 0 1

2 answers:

Answer 0: (score: 2)

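The following is for BigQuery Standard SQL:

#standardSQL
WITH self AS (
  -- IDs that occur at exactly one place feed the diagonal of the matrix
  SELECT arr[OFFSET(0)] place, COUNT(1) cnt
  FROM (
    SELECT ARRAY_AGG(place) arr, id
    FROM `project.dataset.table`
    GROUP BY id
    HAVING ARRAY_LENGTH(arr) = 1
  )
  GROUP BY place
), pairs AS (
  -- all places visited by each ID
  SELECT id, ARRAY_AGG(place) arr
  FROM `project.dataset.table`
  GROUP BY id
), flat_matrix AS (
  -- off-diagonal cells: distinct IDs shared by two different places
  SELECT place1, place2, COUNT(DISTINCT id) cnt
  FROM pairs, UNNEST(arr) place1, UNNEST(arr) place2
  WHERE place1 <> place2
  GROUP BY 1, 2
  UNION ALL
  -- diagonal cells: IDs unique to a single place
  SELECT place, place, cnt
  FROM self
)
-- pivot the (place1, place2, cnt) rows into one column per place
SELECT place1 place,
  MAX(IF(place2 = 'A', cnt, 0)) AS A,
  MAX(IF(place2 = 'B', cnt, 0)) AS B,
  MAX(IF(place2 = 'C', cnt, 0)) AS C,
  MAX(IF(place2 = 'D', cnt, 0)) AS D
FROM flat_matrix
GROUP BY place1
ORDER BY place1

You can test and play with the query above using the dummy data from the question. With those ten rows, the result is the matrix requested in the question:

place  A B C D
A      2 0 2 0
B      0 1 1 0
C      2 1 0 0
D      0 0 0 1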
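One way to try the query without a real table is to define the question's rows inline. The following is only a sketch under assumptions: the CTE name `project.dataset.table` is a placeholder chosen to match the table reference in the query, and the final SELECT is just a sanity check of distinct IDs per place, not part of the answer.

```sql
#standardSQL
-- Sample rows from the question, defined inline so no real table is needed.
-- `project.dataset.table` is a placeholder name, not an existing table.
WITH `project.dataset.table` AS (
  SELECT 1 AS id, 'A' AS place UNION ALL
  SELECT 2, 'B' UNION ALL
  SELECT 1, 'C' UNION ALL
  SELECT 6, 'B' UNION ALL
  SELECT 4, 'D' UNION ALL
  SELECT 5, 'A' UNION ALL
  SELECT 6, 'C' UNION ALL
  SELECT 7, 'A' UNION ALL
  SELECT 8, 'A' UNION ALL
  SELECT 8, 'C'
)
-- Sanity check: number of distinct IDs seen at each place
SELECT place, COUNT(DISTINCT id) AS ids
FROM `project.dataset.table`
GROUP BY place
ORDER BY place
```

To run the pivot query against this data, keep the inline CTE, change the query's leading `WITH self AS (` to `, self AS (` so that it continues the same WITH clause, and drop the sanity-check SELECT.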

Answer 1: (score: 1)

I think you basically want:

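-- seqnum numbers every row so that a row is never paired with itself;
-- `t` in the inner FROM clause stands for the source table
with t as (
      select t.*, row_number() over (order by id) as seqnum
      from t
     )
-- each column is 1 if any id at t.place also occurs at that place, else 0
select t.place,
       max(case when t2.place = 'A' then 1 else 0 end) as A,
       max(case when t2.place = 'B' then 1 else 0 end) as B,
       max(case when t2.place = 'C' then 1 else 0 end) as C,
       max(case when t2.place = 'D' then 1 else 0 end) as D
from t join
     t t2
     on t.id = t2.id and t.seqnum <> t2.seqnum  -- same id, different row
group by t.place
order by t.place;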

This is not exactly the output shown in your question, but it seems to handle the overlaps logically. I don't see how "A"/"A" would be set to 1 while "C"/"C" is set to 0.