Suppose I have the following table:
User_ID Activity_ID
123 222
123 333
124 222
124 224
124 333
125 224
125 333
I want to return the number of users for each distinct pair of overlapping activities, for example:
Activity_ID_1 Activity_ID_2 Count_of_Users
222 333 2
224 333 2
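In the example above, 2 users completed both 222 AND 333.
I don't want to define each combination by hand, because I'm working with 93 different activity_ids. Is there a way to do this entirely in Oracle SQL?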
Answer 0: (score: 1)
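Assuming you have an activity table that holds the activity IDs, and you only want to count DISTINCT users who share the same two activities (a user who has both activities more than once is still counted only once):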
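select a1.activity_id as activity_id_1, a2.activity_id as activity_id_2,
       count(distinct f1.user_id) as count_of_users
from activity a1
inner join facts f1 on f1.activity_id = a1.activity_id
-- second copy of facts: the same user's other activities
inner join facts f2 on f2.user_id = f1.user_id
inner join activity a2 on a2.activity_id = f2.activity_id
-- keep each unordered pair once and drop self-pairs
where a1.activity_id < a2.activity_id
group by a1.activity_id, a2.activity_id
having count(distinct f1.user_id) >= 2
;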
facts is the name of your fact table (the one you showed in the question).
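EDIT: If the facts table (or view, or subquery, or whatever it is) is already distinct by user_id, that is, each user appears at most once per activity, then remove "distinct" from my solution; that will make it more efficient. Note: "distinct" appears twice, once in the SELECT and again in the HAVING.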
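If no separate activity table is available, the same pairing can be sketched as a self-join of facts on user_id. This is only a sketch, assuming facts has the user_id and activity_id columns shown in the question; the output column aliases are made up for illustration.

```sql
-- Sketch: pair up activities done by the same user, using only the facts table.
-- Assumes facts(user_id, activity_id); the aliases are illustrative.
SELECT f1.activity_id             AS activity_id_1,
       f2.activity_id             AS activity_id_2,
       COUNT(DISTINCT f1.user_id) AS count_of_users
FROM   facts f1
       INNER JOIN facts f2
       ON  f2.user_id     = f1.user_id       -- same user
       AND f1.activity_id < f2.activity_id   -- each unordered pair once, no self-pairs
GROUP BY f1.activity_id, f2.activity_id
HAVING COUNT(DISTINCT f1.user_id) >= 2       -- keep only pairs shared by 2+ users
ORDER BY activity_id_1, activity_id_2;
```

Against the sample rows in the question this should return the pairs (222, 333) and (224, 333), each with 2 users. If (user_id, activity_id) is known to be unique in facts, COUNT(*) could replace both COUNT(DISTINCT ...) calls.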
Answer 1: (score: 0)
Oracle Setup:
CREATE TABLE data ( User_ID, Activity_ID ) AS
SELECT 123, 222 FROM DUAL UNION ALL
SELECT 123, 333 FROM DUAL UNION ALL
SELECT 124, 222 FROM DUAL UNION ALL
SELECT 124, 224 FROM DUAL UNION ALL
SELECT 124, 333 FROM DUAL UNION ALL
SELECT 125, 224 FROM DUAL UNION ALL
SELECT 125, 333 FROM DUAL;
CREATE TYPE INTLIST AS TABLE OF INT;
/
Query:
WITH Activities ( User_IDs, Activity_ID ) AS (
SELECT CAST( COLLECT( User_ID ) AS INTLIST ),
Activity_ID
FROM data
GROUP BY Activity_ID
)
SELECT a.Activity_ID,
b.Activity_ID,
CARDINALITY( a.User_IDs MULTISET INTERSECT b.User_IDs ) AS "Count"
FROM Activities a
INNER JOIN
Activities b
ON ( CARDINALITY( a.User_IDs MULTISET INTERSECT b.User_IDs ) > 1
AND a.Activity_ID < b.Activity_ID );
Output:
ACTIVITY_ID ACTIVITY_ID Count
----------- ----------- ----------
222 333 2
224 333 2
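One caveat worth sketching: COLLECT keeps duplicate values and MULTISET INTERSECT defaults to ALL semantics, so a user who appears more than once for each of two activities could be counted more than once. Assuming the same data table and INTLIST type as in the setup above, adding the DISTINCT keyword to the intersection is one way to guard against that (the Activity_ID_1 and Activity_ID_2 aliases are illustrative):

```sql
-- Sketch: same query, but the intersection is de-duplicated before counting.
-- Assumes the data table and INTLIST type created in the setup above.
WITH Activities ( User_IDs, Activity_ID ) AS (
  SELECT CAST( COLLECT( User_ID ) AS INTLIST ),
         Activity_ID
  FROM   data
  GROUP BY Activity_ID
)
SELECT a.Activity_ID AS Activity_ID_1,
       b.Activity_ID AS Activity_ID_2,
       CARDINALITY( a.User_IDs MULTISET INTERSECT DISTINCT b.User_IDs ) AS "Count"
FROM   Activities a
       INNER JOIN Activities b
       ON (   a.Activity_ID < b.Activity_ID
          AND CARDINALITY( a.User_IDs MULTISET INTERSECT DISTINCT b.User_IDs ) > 1 );
```

The sample data has no repeated (User_ID, Activity_ID) rows, so the result there should match the output shown above.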