Using SUM

Time: 2016-02-02 19:52:49

Tags: sql postgresql

The relevant part of my database schema looks like this (Ruby on Rails migration code, but it should be easy to read):

create_table "team_memberships" do |t|
  t.integer  "team_id"
  t.integer  "user_id"
end

create_table "users" do |t|
  t.integer "id"
  t.string  "slug"
end

create_table "performance_points" do |t|
  t.integer "id"
  t.integer "user_id"
  t.date    "date"
  t.integer "points"
  t.integer "team_id"
end

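I want a query that returns a list of users ordered by the total number of performance points they have received since a given date. Note that one "performance_points" row does not equal one point; we need to sum the "points" column.

So far my query looks like this: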

SELECT u.id, u.slug, SUM(pp.points) AS total
FROM users u
JOIN performance_points pp ON pp.user_id = u.id
JOIN team_memberships tm ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id
WHERE (pp.date > '2015-08-02 13:57:14.042221')
GROUP BY pp.id, u.id
ORDER BY total DESC
LIMIT 50

The first three results are:

"id","slug","total"
32369,"andreas-jensen-9de10dec-0f88-427f-b135-62cebea611c8",245
23752,"kenneth-kjaerstad",95
34179,"marius-mork-rydal",93

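To check whether the results are correct, I calculated the points for each user individually. The second result, however, seems to be wrong. I ran this query with Kenneth's id:
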
SELECT SUM(performance_points.points)
FROM performance_points
WHERE performance_points.user_id = 23752
  AND (date > '2015-08-02 13:57:14.042221')

I get: 84. Looking at all of Kenneth's performance points with:

SELECT performance_points.points
FROM performance_points
WHERE performance_points.user_id = 23752
  AND (date > '2015-08-02 13:57:14.042221')

we get:

"points"
-10
1
-2
95

-10 + 1 - 2 + 95 is indeed 84, so I don't understand what is going on in the first query. Why is the total 95?

I am running PostgreSQL version 9.3.5.

5 answers:

Answer 0 (score: 2)

If slug is unique per user:

SELECT u.id, u.slug, SUM(pp.points) AS total
FROM users u
JOIN performance_points pp
ON u.id = pp.user_id
WHERE pp.date > '2015-08-02 13:57:14.042221'
GROUP BY u.id, u.slug
ORDER BY total DESC
LIMIT 50

Otherwise you can't SELECT slug because it's not a grouping column, so there are multiple values of it in each group. You want to GROUP BY user_id in performance_points to get total per user_id then JOIN with users to get slugs.

SELECT id, slug, total
FROM users
JOIN (
    SELECT user_id, SUM(points) AS total
    FROM performance_points
    WHERE date > '2015-08-02 13:57:14.042221'
    GROUP BY user_id) t
ON id = user_id
ORDER BY total DESC
LIMIT 50

(It's not clear why you are JOINing with team_memberships. Presumably performance_points (user_id, team_id) is a foreign key into it, i.e. all such pairs are already in it.)

Answer 1 (score: 2)

I took your query and added a filter to limit it to a single user. You should now see four rows for user kenneth-kjaerstad:

SELECT u.id, u.slug, SUM(pp.points) AS total
FROM
    users u
    JOIN performance_points pp ON pp.user_id = u.id
    JOIN team_memberships tm ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id
WHERE pp.date > '2015-08-02 13:57:14.042221' and u.id = 23752
GROUP BY pp.id, u.id

The problem was that the sort pushed all the other rows way down this list and you never saw that there were three others for him besides the one at the top of the ranking.

The reason is that your grouping is wrong: you just want one total per user. pp.id is unique for every row in your results, so grouping by it creates one group per performance_points row and SUM never adds more than a single value. It's pointless to have a group by on that column at all.
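To make the grouping effect concrete, here is a minimal sketch that reproduces it with Python's built-in sqlite3 module instead of PostgreSQL. The four point values are the ones from the question; the dates and the team id are made up for illustration, and the team_memberships join is left out because it does not affect the grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, slug TEXT);
    CREATE TABLE performance_points (
        id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT,
        points INTEGER, team_id INTEGER
    );
    INSERT INTO users VALUES (23752, 'kenneth-kjaerstad');
    -- Point values from the question; dates and team_id are hypothetical.
    INSERT INTO performance_points (user_id, date, points, team_id) VALUES
        (23752, '2015-09-01', -10, 1),
        (23752, '2015-09-02',   1, 1),
        (23752, '2015-09-03',  -2, 1),
        (23752, '2015-09-04',  95, 1);
""")

# Grouping by pp.id: one group per performance_points row.
per_row = conn.execute("""
    SELECT u.id, u.slug, SUM(pp.points) AS total
    FROM users u
    JOIN performance_points pp ON pp.user_id = u.id
    GROUP BY pp.id, u.id
    ORDER BY total DESC
""").fetchall()

# Grouping by the user only: one group per user.
per_user = conn.execute("""
    SELECT u.id, u.slug, SUM(pp.points) AS total
    FROM users u
    JOIN performance_points pp ON pp.user_id = u.id
    GROUP BY u.id, u.slug
    ORDER BY total DESC
""").fetchall()

print(per_row)   # four rows, the 95 row sorted first
print(per_user)  # a single row holding the per-user total
```

With the per-row grouping, the 95 row floats to the top of the ranking on its own, which matches what the question observed.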

Also, I'll note that there doesn't seem to be a purpose in your join to the team_memberships table unless you need to guarantee that a team membership exists for each pair of user and team ids from the points table. Here's the fix:

SELECT u.id, min(u.slug) as slug, SUM(pp.points) AS total
FROM
    users u
    JOIN performance_points pp ON pp.user_id = u.id
    JOIN team_memberships tm ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id
WHERE pp.date > '2015-08-02 13:57:14.042221'
GROUP BY u.id
ORDER by total desc

This answer is essentially equivalent to @philipxy and @Hambone's. As you can see it's not strictly necessary to use some of the constructs they chose. Hopefully my explanation of what went wrong is helpful whichever approach you prefer.

Answer 2 (score: 1)

Without seeing all of your data it's a little hard to guess, but maybe a CTE to pre-process the performance points would do it:

with pp_totals as (
  select user_id, sum (points) as points
  from performance_points
  where date > '2015-08-02 13:57:14.042221'
  group by user_id
)
SELECT
  u.id, u.slug, pp.points AS total
FROM
  users u
  JOIN pp_totals pp ON pp.user_id = u.id
  JOIN team_memberships tm ON tm.user_id = u.id
ORDER BY pp.points DESC
limit 50

If this doesn't do it, can you create a SQL Fiddle and post it to your question?

Answer 3 (score: 0)

Try the following query and let us know whether it works:

SELECT u.id, u.slug, SUM(pp.points) AS total
FROM users u
INNER JOIN (select user_id,date,team_id, SUM(points) as points from performance_points group by user_id,date,team_id) pp ON pp.user_id = u.id
INNER JOIN (select team_id, user_id from team_memberships group by team_id, user_id) tm ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id
WHERE (pp.date > '2015-08-02 13:57:14.042221')
GROUP BY u.id, u.slug
ORDER BY total DESC
LIMIT 50
;

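Answer 4 (score: 0)

I found that there was actually nothing wrong with the query; the problem was in the data. Some users were in multiple teams more than once, and that caused the issue.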
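As an illustration of how such data can distort a total, the sketch below assumes the problem was duplicate rows in team_memberships for the same (team_id, user_id) pair; the thread does not confirm this, and all ids and point values here are hypothetical. It uses Python's built-in sqlite3 module rather than PostgreSQL. A plain JOIN repeats every performance_points row once per matching membership row, while an EXISTS check only tests that a membership is present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team_memberships (team_id INTEGER, user_id INTEGER);
    CREATE TABLE performance_points (
        id INTEGER PRIMARY KEY, user_id INTEGER,
        points INTEGER, team_id INTEGER
    );
    -- Hypothetical data: user 1 is listed twice in team 10.
    INSERT INTO team_memberships VALUES (10, 1), (10, 1);
    INSERT INTO performance_points (user_id, points, team_id) VALUES
        (1, 5, 10),
        (1, 7, 10);
""")

# Plain join: each points row matches both membership rows.
joined = conn.execute("""
    SELECT pp.user_id, SUM(pp.points)
    FROM performance_points pp
    JOIN team_memberships tm
      ON tm.team_id = pp.team_id AND tm.user_id = pp.user_id
    GROUP BY pp.user_id
""").fetchall()

# Semi-join: each points row is counted once if any membership exists.
semi = conn.execute("""
    SELECT pp.user_id, SUM(pp.points)
    FROM performance_points pp
    WHERE EXISTS (
        SELECT 1 FROM team_memberships tm
        WHERE tm.team_id = pp.team_id AND tm.user_id = pp.user_id
    )
    GROUP BY pp.user_id
""").fetchall()

print(joined)  # total is doubled by the duplicate membership
print(semi)    # total counts each points row once
```

A unique constraint on team_memberships (team_id, user_id) would be another way to rule this out at the data level, if the application allows it.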