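I have users in different categories, and a join table that lets a user belong to multiple categories. The join table is named categories_users and consists of user_id and category_id.
I want to filter for users who are in both category1 and category2. For example, I want to find everyone who is interested in both baseball and football.
What is the best way to do this in PostgreSQL? I have the following, which works: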
SELECT *
FROM users
WHERE users.id IN (
    SELECT categories_users.user_id
    FROM categories_users
    JOIN categories ON categories.id = categories_users.category_id
    WHERE categories.id = 1 OR categories.parent_id = 1
)
AND users.id IN (
    SELECT categories_users.user_id
    FROM categories_users
    JOIN categories ON categories.id = categories_users.category_id
    WHERE categories.id = 2 OR categories.parent_id = 2
)
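However, this feels clunky, and I wonder whether there is a better way to do it. I have tried various joins, but I always end up searching the categories_users table for a single row whose category_id is both 1 and 2, which is impossible.
Edit: I actually also need to search on the category parent, so I have changed the query above to include parent_id.
Answer 0 (score: 2)
Just join the same table twice (using aliases):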
SELECT u.*
FROM users u
JOIN categories_users cu1 ON cu1.user_id = u.id
JOIN categories_users cu2 ON cu2.user_id = u.id
WHERE cu1.category_id = 1 AND cu2.category_id = 2
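The question's edit also asks for matches through a category's parent, which the query above does not cover. A minimal sketch of how the same double-join idea might be extended, assuming the table and column names from the question (categories.id, categories.parent_id) and only one level of parent/child nesting. DISTINCT ON (u.id) is an addition here, not part of the original answer; it collapses the duplicate rows that appear when a user matches a category through more than one child.

```sql
-- Sketch only: double self-join extended to match a category or its direct parent.
-- Names follow the question's schema; only one level of nesting is handled.
SELECT DISTINCT ON (u.id) u.*
FROM users u
JOIN categories_users cu1 ON cu1.user_id = u.id
JOIN categories c1 ON c1.id = cu1.category_id
JOIN categories_users cu2 ON cu2.user_id = u.id
JOIN categories c2 ON c2.id = cu2.category_id
WHERE (c1.id = 1 OR c1.parent_id = 1)   -- first required category
  AND (c2.id = 2 OR c2.parent_id = 2);  -- second required category
```

Each additional required category needs another pair of joins, so this form is practical mainly for a small, fixed number of categories.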
Answer 1 (score: 1)
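select u.*
from
    users u
    inner join (
        -- parent_id is a column of categories, so join it to categories_users
        select cu.user_id
        from categories_users cu
        inner join categories c on c.id = cu.category_id
        group by cu.user_id
        having
            bool_or(1 in (c.id, c.parent_id)) and
            bool_or(2 in (c.id, c.parent_id))
    ) s on s.user_id = u.id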
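Answer 2 (score: 1)
You can also use COUNT(*) over a partition to see how many of the searched categories each user has.
I created the following example to show how this can be defined and parameterized.
I created a function, test.find_users_in_categories, in the script below:
CREATE SCHEMA test;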
CREATE TABLE test.categories_users (
category_id BIGINT NOT NULL,
user_id BIGINT NOT NULL
);
INSERT INTO test.categories_users
(user_id, category_id)
VALUES
(33, 103),
(34, 104),
(35, 105),
(37, 105),
(35, 106),
(37, 106);
CREATE OR REPLACE FUNCTION test.find_users_in_categories(BIGINT[])
RETURNS TABLE (
user_id BIGINT
)
AS
$$
DECLARE
categories ALIAS FOR $1;
BEGIN
RETURN QUERY
SELECT t.user_id
FROM
(
SELECT
cu.user_id,
cu.category_id,
COUNT(*) OVER (PARTITION BY cu.user_id ) AS cnt
FROM test.categories_users AS cu
WHERE cu.category_id = ANY(categories)
) AS t
WHERE t.cnt = array_length(categories, 1)
GROUP BY t.user_id;
END;
$$
LANGUAGE plpgsql;
SELECT * FROM test.find_users_in_categories(ARRAY[105, 106]);
DROP SCHEMA test CASCADE;
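A variation, offered as a sketch rather than as part of the original answer: the same check can be written with GROUP BY and HAVING instead of a window function. COUNT(DISTINCT ...) keeps the result correct even if categories_users holds duplicate (user_id, category_id) rows, which the window-function version assumes it does not. The query uses the test.categories_users sample table from the script above, so it has to run before the final DROP SCHEMA; ARRAY[105, 106] is just the sample input and is assumed to contain no repeated values.

```sql
-- Sketch only: aggregate alternative to the COUNT(*) OVER (...) approach.
-- Assumes the test.categories_users sample table above still exists.
SELECT cu.user_id
FROM test.categories_users AS cu
WHERE cu.category_id = ANY(ARRAY[105, 106]::BIGINT[])
GROUP BY cu.user_id
-- a user qualifies only when every searched category is matched
HAVING COUNT(DISTINCT cu.category_id) = array_length(ARRAY[105, 106], 1)
ORDER BY cu.user_id;
```

With the sample rows, users 35 and 37 are the only ones present in both 105 and 106, so those are the two rows returned.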
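The function accepts an array of the categories for which we need the list of users, so it returns every user who is in all of the given categories.
The script above is the first solution: it gets the users found in all the given categories. The script below is the recursive version, which also covers subcategories.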
CREATE SCHEMA test;
CREATE TABLE test.categories (
category_id BIGINT PRIMARY KEY,
parent_id BIGINT REFERENCES test.categories(category_id)
);
CREATE TABLE test.categories_users (
category_id BIGINT NOT NULL REFERENCES test.categories(category_id),
user_id BIGINT NOT NULL
);
INSERT INTO test.categories
(category_id, parent_id)
VALUES
(100, NULL),
(101, 100),
(102, 100),
(103, 101),
(104, 101),
(105, 101),
(106, NULL);
INSERT INTO test.categories_users
(user_id, category_id)
VALUES
(33, 103),
(34, 104),
(35, 105),
(37, 105),
(35, 106),
(37, 106);
CREATE OR REPLACE FUNCTION test.find_users_in_categories(BIGINT[])
RETURNS TABLE (
user_id BIGINT
)
AS
$$
DECLARE
main_categories ALIAS FOR $1;
BEGIN
RETURN QUERY
WITH
-- get all main categories and subcategories
RECURSIVE cte_categories (category_id, main_category_id) AS
(
SELECT cat.category_id, cat.category_id AS main_category_id
FROM test.categories AS cat
WHERE cat.category_id = ANY(main_categories)
UNION ALL
SELECT cat.category_id, cte.main_category_id
FROM cte_categories AS cte
INNER JOIN test.categories AS cat
ON cte.category_id = cat.parent_id
),
-- filter main categories that are found as children of other categories
cte_categories_unique AS
(
SELECT cte.*
FROM cte_categories AS cte
LEFT JOIN
(
SELECT category_id
FROM cte_categories
WHERE category_id <> main_category_id
GROUP BY category_id
) AS to_exclude
ON cte.main_category_id = to_exclude.category_id
WHERE to_exclude.category_id IS NULL
),
-- compute the count of main categories
cte_main_categories_count AS
(
SELECT COUNT(DISTINCT main_category_id) AS cnt
FROM cte_categories_unique
)
SELECT t.user_id
FROM
(
-- get the users which are found in each category/sub-category then group them under the main category
SELECT
cu.user_id,
cte.main_category_id
FROM test.categories_users AS cu
INNER JOIN cte_categories_unique AS cte
ON cu.category_id = cte.category_id
GROUP BY cu.user_id, cte.main_category_id
) AS t
GROUP BY t.user_id
-- filter users that do not have a match on all main categories or their sub-categories
HAVING COUNT(*) = (SELECT cnt FROM cte_main_categories_count);
END;
$$
LANGUAGE plpgsql;
SELECT * FROM test.find_users_in_categories(ARRAY[101, 106]);
DROP SCHEMA test CASCADE;
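Edit - [recursive solution]
Solution - get users found in all the given categories and subcategories
The code above implements this solution with JOIN plus a recursive CTE. I used JOIN rather than COUNT() because it seemed a better fit for this case.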
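To see which categories the recursive CTE expands a search into, its first part can be run on its own. This is a sketch that reuses the CTE definition and the test.categories sample table from the recursive script, so it has to run before that script's DROP SCHEMA; ARRAY[101, 106] is the same sample input passed to the function.

```sql
-- Sketch only: the category expansion step of the recursive solution, run standalone.
-- Assumes the test.categories sample table from the recursive script still exists.
WITH RECURSIVE cte_categories (category_id, main_category_id) AS
(
    -- anchor: each searched category is its own main category
    SELECT cat.category_id, cat.category_id AS main_category_id
    FROM test.categories AS cat
    WHERE cat.category_id = ANY(ARRAY[101, 106]::BIGINT[])
    UNION ALL
    -- recursive step: attach children to the main category of their parent
    SELECT cat.category_id, cte.main_category_id
    FROM cte_categories AS cte
    INNER JOIN test.categories AS cat
        ON cte.category_id = cat.parent_id
)
SELECT main_category_id, category_id
FROM cte_categories
ORDER BY main_category_id, category_id;
```

With the sample data, main category 101 expands to 101, 103, 104 and 105, while 106 has no children and maps only to itself. A user therefore satisfies the search for 101 through any of those four categories.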