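I need to count the IDs that are unique to each of A, B, C, and D. So "A"/"A", "B"/"B", "C"/"C", and "D"/"D" should give me the count of IDs unique to A, B, C, and D. "A"/"B" and "B"/"A", on the other hand, should be the IDs that have Place as both A and B. Similarly, "A"/"C" and "C"/"A" should be the IDs that have Place as both A and C, that is, the overlap of IDs between the two places. The count needs to keep incrementing every time there is an overlap. Can someone suggest an approach? I have the table below: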
ID Place
1 A
2 B
1 C
6 B
4 D
5 A
6 C
7 A
8 A
8 C
Could you please guide me on how to produce the output below?
Place  A  B  C  D
A      2  0  2  0
B      0  1  1  0
C      2  1  0  0
D      0  0  0  1
Answer 0 (score: 2)
The following is for BigQuery Standard SQL; you can test it against the dummy data from your question:
#standardSQL
WITH self AS (
  -- IDs that occur in exactly one place, counted per place (the diagonal cells)
  SELECT arr[OFFSET(0)] place, COUNT(1) cnt
  FROM (
    SELECT ARRAY_AGG(place) arr, id
    FROM `project.dataset.table`
    GROUP BY id
    HAVING ARRAY_LENGTH(arr) = 1
  )
  GROUP BY place
), pairs AS (
  -- all places of each ID, collected into one array
  SELECT id, ARRAY_AGG(place) arr
  FROM `project.dataset.table`
  GROUP BY id
), flat_matrix AS (
  -- off-diagonal cells: distinct IDs shared by two different places
  SELECT place1, place2, COUNT(DISTINCT id) cnt
  FROM pairs, UNNEST(arr) place1, UNNEST(arr) place2
  WHERE place1 <> place2
  GROUP BY 1, 2
  UNION ALL
  SELECT place, place, cnt
  FROM self
)
-- pivot the (place1, place2, cnt) rows into one column per place
SELECT place1 place,
  MAX(IF(place2 = 'A', cnt, 0)) AS A,
  MAX(IF(place2 = 'B', cnt, 0)) AS B,
  MAX(IF(place2 = 'C', cnt, 0)) AS C,
  MAX(IF(place2 = 'D', cnt, 0)) AS D
FROM flat_matrix
GROUP BY place1
ORDER BY place1
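To try this without a real table, the sample rows from the question can be inlined as a CTE. The following is only a minimal sketch: it assumes the placeholder name `project.dataset.table` is reused as the CTE name so that the query's FROM clauses resolve to the inline data, and its final SELECT merely previews the per-ID arrays that the `pairs` step builds.

```sql
#standardSQL
-- Sample rows copied from the question; the CTE shadows the placeholder table name.
WITH `project.dataset.table` AS (
  SELECT 1 AS id, 'A' AS place UNION ALL
  SELECT 2, 'B' UNION ALL
  SELECT 1, 'C' UNION ALL
  SELECT 6, 'B' UNION ALL
  SELECT 4, 'D' UNION ALL
  SELECT 5, 'A' UNION ALL
  SELECT 6, 'C' UNION ALL
  SELECT 7, 'A' UNION ALL
  SELECT 8, 'A' UNION ALL
  SELECT 8, 'C'
)
-- Preview: one row per ID with all of its places in a single array.
SELECT id, ARRAY_AGG(place ORDER BY place) AS arr
FROM `project.dataset.table`
GROUP BY id
ORDER BY id
```

To run the full query against this data, keep the inline CTE and continue with `, self AS (` in place of `WITH self AS (`.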
Answer 1 (score: 1)
I think you basically want:
with t as (
select t.*, row_number() over (order by id) as seqnum
from t
)
select t.place,
max(case when t2.place = 'A' then 1 else 0 end) as A,
max(case when t2.place = 'B' then 1 else 0 end) as B,
max(case when t2.place = 'C' then 1 else 0 end) as C,
max(case when t2.place = 'D' then 1 else 0 end) as D
from t join
t t2
on t.id = t2.id and t.seqnum <> t2.seqnum
group by t.place
order by t.place;
This is not exactly the same as the output in your question, but it seems to handle the overlaps logically. I don't see how "A"/"A" is set to 1 while "C"/"C" is set to 0.
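The diagonal becomes explainable under the reading the question itself gives: "A"/"A" counts IDs that appear only in A, and "C"/"C" is 0 because every ID with place C also has another place. The following is a hedged sketch of that reading, not the answer author's method. The names `per_id` and `n_places` are made up for illustration, the sample rows are inlined from the question, and it assumes each (ID, Place) pair occurs at most once.

```sql
WITH t AS (
  -- Sample rows from the question.
  SELECT 1 AS id, 'A' AS place UNION ALL
  SELECT 2, 'B' UNION ALL
  SELECT 1, 'C' UNION ALL
  SELECT 6, 'B' UNION ALL
  SELECT 4, 'D' UNION ALL
  SELECT 5, 'A' UNION ALL
  SELECT 6, 'C' UNION ALL
  SELECT 7, 'A' UNION ALL
  SELECT 8, 'A' UNION ALL
  SELECT 8, 'C'
), per_id AS (
  -- Attach to every row the number of places its ID has (assumes no duplicate rows).
  SELECT id, place, COUNT(*) OVER (PARTITION BY id) AS n_places
  FROM t
)
SELECT p1.place,
  -- Off-diagonal: IDs present in both places. Diagonal: IDs with exactly one place.
  COUNT(DISTINCT CASE WHEN p2.place = 'A' AND (p1.place <> p2.place OR p1.n_places = 1) THEN p1.id END) AS A,
  COUNT(DISTINCT CASE WHEN p2.place = 'B' AND (p1.place <> p2.place OR p1.n_places = 1) THEN p1.id END) AS B,
  COUNT(DISTINCT CASE WHEN p2.place = 'C' AND (p1.place <> p2.place OR p1.n_places = 1) THEN p1.id END) AS C,
  COUNT(DISTINCT CASE WHEN p2.place = 'D' AND (p1.place <> p2.place OR p1.n_places = 1) THEN p1.id END) AS D
FROM per_id p1
JOIN per_id p2
  ON p1.id = p2.id
GROUP BY p1.place
ORDER BY p1.place;
```

The self-join deliberately keeps each row paired with itself, so that single-place IDs can contribute to the diagonal; counting distinct IDs rather than taking a 0/1 flag is what turns the cells into counts.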