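I have a database of people and projects. How can I find the names of the people who have collaborated with a given person, and on how many projects?
For example, I want to find Jimmy's collaborators in this table: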
+----------+--------+
| project | person |
+----------+--------+
| datamax | Jimmy |
| datamax | Ashley |
| datamax | Martin |
| cocoplus | Jimmy |
| cocoplus | Ashley |
| glassbox | Jimmy |
| glassbox | Martin |
| powerbin | Jimmy |
| powerbin | Ashley |
+----------+--------+
The result should look like this:
Jimmy's collaborations:
+--------+----------------+
| person | collaborations |
+--------+----------------+
| Ashley | 3 |
| Martin | 2 |
+--------+----------------+
Answer 0 (score: 2)
Self-join the table and group by the person field:
SELECT u2.person, COUNT(u1.project) AS collaborations
FROM users u1
JOIN users u2 ON u2.project = u1.project
WHERE u1.person != u2.person AND u1.person = 'Jimmy'
GROUP BY u2.person;
The query selects the projects Jimmy participated in from u1. The rows from u2 are filtered by the rows from u1, so only u2 rows on those same projects are kept. Rows in which both aliases refer to the same person are removed by the WHERE clause. Finally, the result set is grouped by person, and the COUNT function counts the rows in each group.
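The self-join above can be checked against the sample rows from the question. The following is a minimal sketch, assuming an in-memory SQLite database stands in for MySQL; the helper name collaborators and the ORDER BY clause are additions for illustration and are not part of the original query.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("datamax", "Jimmy"), ("datamax", "Ashley"), ("datamax", "Martin"),
    ("cocoplus", "Jimmy"), ("cocoplus", "Ashley"),
    ("glassbox", "Jimmy"), ("glassbox", "Martin"),
    ("powerbin", "Jimmy"), ("powerbin", "Ashley"),
]


def collaborators(person, rows=ROWS):
    """Return [(collaborator, shared_project_count), ...] for `person`.

    Hypothetical helper: SQLite is used here only as a stand-in for MySQL.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE users (project TEXT NOT NULL, person TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO users (project, person) VALUES (?, ?)", rows
        )
        # Same self-join as in the answer; ORDER BY is added only to make
        # the row order deterministic.
        query = """
            SELECT u2.person, COUNT(u1.project) AS collaborations
            FROM users u1
            JOIN users u2 ON u2.project = u1.project
            WHERE u1.person != u2.person AND u1.person = ?
            GROUP BY u2.person
            ORDER BY collaborations DESC, u2.person
        """
        return conn.execute(query, (person,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for name, count in collaborators("Jimmy"):
        print(name, count)
```

Passing the name as a bound parameter instead of embedding 'Jimmy' in the SQL string is a design choice of this sketch; the query itself is unchanged.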
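Performance
Note that an index on the person and project columns (or two separate indexes) will significantly improve the performance of the query above. The exact index configuration depends on the table structure. That said, for a table with two varchar fields, person and project, I think the following should be enough:
ALTER TABLE users ADD INDEX `project` (`project`(10));
ALTER TABLE users ADD INDEX `person` (`person`(10));
Normalization
However, I would rather store people and projects in separate tables with numeric IDs. A third table can act as the connector: person_id - project_id. In other words, I suggest normalization. With normalized tables, you do not need to build bloated indexes on text fields.
The normalized tables might look like this: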
CREATE TABLE users (
id int unsigned NOT NULL AUTO_INCREMENT,
name varchar(200) NOT NULL DEFAULT '',
PRIMARY KEY(`id`),
-- This index is needed if you want to fetch users by name
INDEX name (name(8))
);
CREATE TABLE projects (
id int unsigned NOT NULL AUTO_INCREMENT,
name varchar(100) NOT NULL DEFAULT '',
PRIMARY KEY(`id`)
);
CREATE TABLE collaborations (
project_id int unsigned NOT NULL DEFAULT 0,
user_id int unsigned NOT NULL DEFAULT 0,
PRIMARY KEY(`project_id`, `user_id`)
);
The query for the normalized structure looks a little more complicated:
-- In practice, the user ID is retrieved from the calling process
-- (such as POST/GET HTTP requests, for instance).
SET @user_id := (SELECT id FROM users WHERE name LIKE 'Jimmy');
SELECT u.name person, COUNT(p.id) collaborations
FROM collaborations c
JOIN collaborations c2 USING(project_id)
JOIN users u ON u.id = c2.user_id
JOIN projects p ON p.id = c2.project_id
WHERE c.user_id = @user_id AND c.user_id != c2.user_id
GROUP BY c2.user_id;
But it will be fast, and the indexes will need significantly less space, especially for large data sets.
Original answer
To get the total number of projects per person, use the COUNT function with a GROUP BY clause.
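As a minimal sketch of the COUNT with GROUP BY pattern for per-person project totals, assuming the flat users(project, person) table from the question and an in-memory SQLite database in place of MySQL. The helper name projects_per_person and the column alias projects are made up for this illustration.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("datamax", "Jimmy"), ("datamax", "Ashley"), ("datamax", "Martin"),
    ("cocoplus", "Jimmy"), ("cocoplus", "Ashley"),
    ("glassbox", "Jimmy"), ("glassbox", "Martin"),
    ("powerbin", "Jimmy"), ("powerbin", "Ashley"),
]


def projects_per_person(rows=ROWS):
    """Return [(person, project_count), ...] sorted by person name.

    Hypothetical helper: SQLite is used here only as a stand-in for MySQL.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE users (project TEXT NOT NULL, person TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO users (project, person) VALUES (?, ?)", rows
        )
        # One group per person; COUNT counts the project rows in each group.
        query = """
            SELECT person, COUNT(project) AS projects
            FROM users
            GROUP BY person
            ORDER BY person
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for name, total in projects_per_person():
        print(name, total)
```

This counts every project a person appears in, which is a different number from the collaboration count in the question: Jimmy is on four projects, while each collaborator shares only some of them with him.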