A specific complex SQL query and the Django ORM?

Asked: 2012-05-20 14:59:43

Tags: sql django orm

I have a set of tables that contain content which is created and voted on by users.

content_a

id         /* the id of the content */
user_id    /* the user that contributed the content */
content    /* the content */

content_b

id
user_id
content

content_c

id
user_id
content

voting

user_id         /* the user that made the vote */
content_id      /* the content the vote was made on */
content_type_id /* the content type the vote was made on */
vote            /* the value of the vote, either +1 or -1 */

I'd like to be able to select a set of users and order them by the sum of the votes on the content they have produced. For example,

SELECT * FROM users ORDER BY <sum of votes on all content associated with user>

Is there a specific way to achieve this with Django's ORM, or do I have to use a raw SQL query? And what would be the most efficient way to achieve this in raw SQL?

3 answers:

Answer 0 (score: 7)

Update

Assuming the models are

from django.contrib.auth.models import User
from django.contrib.contenttypes import generic
from django.contrib.contenttypes.models import ContentType
from django.db import models


class ContentA(models.Model):
    user = models.ForeignKey(User)
    content = models.TextField()

class ContentB(models.Model):
    user = models.ForeignKey(User)
    content = models.TextField()

class ContentC(models.Model):
    user = models.ForeignKey(User)
    content = models.TextField()

class GenericVote(models.Model):
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey()
    user = models.ForeignKey(User)
    vote = models.IntegerField(default=1)

Option A. Using GenericVote

GenericVote.objects.extra(select={'uid':"""
CASE
WHEN content_type_id = {ct_a} THEN (SELECT user_id FROM {ContentA._meta.db_table} WHERE id = object_id)
WHEN content_type_id = {ct_b} THEN (SELECT user_id FROM {ContentB._meta.db_table} WHERE id = object_id)
WHEN content_type_id = {ct_c} THEN (SELECT user_id FROM {ContentC._meta.db_table} WHERE id = object_id)
END""".format(
ct_a=ContentType.objects.get_for_model(ContentA).pk,
ct_b=ContentType.objects.get_for_model(ContentB).pk,
ct_c=ContentType.objects.get_for_model(ContentC).pk,
ContentA=ContentA,
ContentB=ContentB,
ContentC=ContentC
)}).values('uid').annotate(vc=models.Sum('vote')).order_by('-vc')

The ValuesQuerySet above (or its values_list() equivalent) gives you a sequence of User ids ordered by decreasing vote total. You can then use it to fetch the top users.
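The last step, turning the ordered ids back into user objects, could look something like the sketch below. It is plain Python with stand-in data: `rows` mimics what the ValuesQuerySet yields, and `users_by_id` mimics the id-to-object dict that a call such as `User.objects.in_bulk(ids)` would return. The helper name `top_users` is made up for illustration.

```python
def top_users(rows, users_by_id, limit=None):
    """Return (user, vote_total) pairs in the order given by `rows`.

    rows        -- iterable of {'uid': ..., 'vc': ...} dicts, already sorted
    users_by_id -- mapping of user id -> user object (stand-in for in_bulk())
    """
    ordered = []
    for row in rows:
        user = users_by_id.get(row['uid'])
        # uid is None when the CASE expression found no matching content row
        if user is not None:
            ordered.append((user, row['vc']))
    return ordered if limit is None else ordered[:limit]


# Stand-in data; in Django these would come from the query above.
rows = [{'uid': 2, 'vc': 3}, {'uid': 1, 'vc': 2}, {'uid': 3, 'vc': -1}]
users_by_id = {1: 'first', 2: 'second', 3: 'third'}
result = top_users(rows, users_by_id, limit=2)
```

Keeping the ordering in Python matters here because a lookup by a set of ids does not by itself preserve the order of that set.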

Option B. Using User.objects.raw

With User.objects.raw, I end up with almost the same query as the answer given by forsvarir:

User.objects.raw("""
SELECT "{user_tbl}".*, SUM("gv"."vc") as vote_count from {user_tbl},
    (SELECT id, user_id, {ct_a} AS ct FROM {ContentA._meta.db_table} UNION
     SELECT id, user_id, {ct_b} AS ct FROM {ContentB._meta.db_table} UNION
     SELECT id, user_id, {ct_c} as ct FROM {ContentC._meta.db_table}
    ) as c,
   (SELECT content_type_id, object_id, SUM("vote") as vc FROM {GenericVote._meta.db_table} GROUP BY content_type_id, object_id) as gv
WHERE {user_tbl}.id = c.user_id
    AND gv.content_type_id = c.ct
    AND gv.object_id = c.id
GROUP BY {user_tbl}.id
ORDER BY vote_count DESC""".format(
    user_tbl=User._meta.db_table, ContentA=ContentA, ContentB=ContentB,
    ContentC=ContentC, GenericVote=GenericVote, 
    ct_a=ContentType.objects.get_for_model(ContentA).pk,
    ct_b=ContentType.objects.get_for_model(ContentB).pk,
    ct_c=ContentType.objects.get_for_model(ContentC).pk
))

Option C. Other possible ways

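  • Denormalize vote_count onto User or a profile model (for example UserProfile or another related model), as suggested by Michael Dunn. This works much better if you access vote_count frequently.
  • Build a DB view that does the UNIONs for you, then map a model to it. This can make the query easier to construct.
  • Sort in Python. This is usually the best way to deal with large-scale data, because there are dozens of toolkits and ways to scale out.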
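To make the last bullet concrete, here is a minimal sketch of aggregating and sorting in Python rather than in SQL. The input shapes (`owners`, `votes`) and the function name `rank_users` are assumptions for illustration; in practice they would be filled from the content and vote tables.

```python
from collections import defaultdict


def rank_users(owners, votes):
    """Sum votes per content owner and sort by total, highest first.

    owners -- {(content_type_id, object_id): user_id}
    votes  -- iterable of (content_type_id, object_id, vote) tuples
    """
    totals = defaultdict(int)
    for content_type_id, object_id, vote in votes:
        user_id = owners.get((content_type_id, object_id))
        if user_id is not None:  # skip votes on content we do not know about
            totals[user_id] += vote
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: users 10 and 20 own the content.
owners = {(1, 1): 10, (1, 2): 10, (2, 1): 20}
votes = [(1, 1, 1), (1, 2, 1), (2, 1, 1), (2, 1, -1), (3, 9, 1)]
ranking = rank_users(owners, votes)
```

Users who own content but have received no votes do not appear in the result; whether they should be listed with a total of 0 is a design choice left open here.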

Before you can query with the Django ORM, you need Django models that map to these tables. Assuming they are User and Voting models matching the users and voting tables, you can

User.objects.annotate(v=models.Sum('voting__vote')).order_by('v')

Answer 1 (score: 3)

For a raw SQL solution, I've created a rough replication of your problem on ideone.

Data setup:

create table content_a(id int, user_id int, content varchar(20));
create table content_b(id int, user_id int, content varchar(20));
create table content_c(id int, user_id int, content varchar(20));
create table voting(user_id int, content_id int, content_type_id int, vote int);
create table users(id int, name varchar(20));
insert into content_a values(1,1,'aaaa');
insert into content_a values(2,1,'bbbb');
insert into content_a values(3,1,'cccc');
insert into content_b values(1,2,'dddd');
insert into content_b values(2,2,'eeee');
insert into content_b values(3,2,'ffff');
insert into content_c values(1,1,'gggg');
insert into content_c values(2,2,'hhhh');
insert into content_c values(3,3,'iiii');
insert into users values(1, 'first');
insert into users values(2, 'second');
insert into users values(3, 'third');
insert into users values(4, 'voteonly');

-- user 1 net votes (2)
insert into voting values (1, 1, 1, 1);
insert into voting values (2, 3, 1, -1);
insert into voting values (3, 1, 1, 1); 
insert into voting values (4, 2, 1, 1); 

-- user 2 net votes (3)
insert into voting values (1, 2, 2, 1);
insert into voting values (1, 1, 2, 1);
insert into voting values (2, 3, 2, -1);
insert into voting values (4, 2, 2, 1);
insert into voting values (4, 2, 3, 1);

-- user 3 net votes (-1)
insert into voting values (2, 3, 3, -1);

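I've basically assumed that content_a has a type of 1, content_b a type of 2, and content_c a type of 3. With raw SQL there seem to be two obvious approaches. The first is to union all of the content together and then join it with the users and voting tables. I've tested this approach below.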

select users.*, sum(voting.vote)
from users, 
    voting, (
        SELECT     id, 1 AS content_type_id, user_id
        FROM         content_a
        UNION
        SELECT     id, 2 AS content_type_id, user_id
        FROM         content_b
        UNION
        SELECT     id, 3 AS content_type_id, user_id
        FROM         content_c) contents
where contents.user_id = users.id
and voting.content_id = contents.id
and voting.content_type_id = contents.content_type_id
group by users.id
order by sum(voting.vote) desc;
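As a quick way to check this query without a database server, the sketch below loads the same sample data into an in-memory SQLite database through Python's `sqlite3` module and runs essentially the same statement. The only changes are the explicit column list and the `total` alias, which are added here for readability and are not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users(id int, name varchar(20));
    create table content_a(id int, user_id int, content varchar(20));
    create table content_b(id int, user_id int, content varchar(20));
    create table content_c(id int, user_id int, content varchar(20));
    create table voting(user_id int, content_id int, content_type_id int, vote int);
""")
conn.executemany("insert into users values (?, ?)",
                 [(1, "first"), (2, "second"), (3, "third"), (4, "voteonly")])
conn.executemany("insert into content_a values (?, ?, ?)",
                 [(1, 1, "aaaa"), (2, 1, "bbbb"), (3, 1, "cccc")])
conn.executemany("insert into content_b values (?, ?, ?)",
                 [(1, 2, "dddd"), (2, 2, "eeee"), (3, 2, "ffff")])
conn.executemany("insert into content_c values (?, ?, ?)",
                 [(1, 1, "gggg"), (2, 2, "hhhh"), (3, 3, "iiii")])
conn.executemany("insert into voting values (?, ?, ?, ?)", [
    (1, 1, 1, 1), (2, 3, 1, -1), (3, 1, 1, 1), (4, 2, 1, 1),  # on user 1's content
    (1, 2, 2, 1), (1, 1, 2, 1), (2, 3, 2, -1), (4, 2, 2, 1),  # on user 2's content_b
    (4, 2, 3, 1),                                             # on user 2's content_c
    (2, 3, 3, -1),                                            # on user 3's content
])

# Union all content into one derived table, then join to users and voting.
rows = conn.execute("""
    select users.id, users.name, sum(voting.vote) as total
    from users, voting, (
        select id, 1 as content_type_id, user_id from content_a
        union
        select id, 2 as content_type_id, user_id from content_b
        union
        select id, 3 as content_type_id, user_id from content_c) contents
    where contents.user_id = users.id
      and voting.content_id = contents.id
      and voting.content_type_id = contents.content_type_id
    group by users.id, users.name
    order by total desc
""").fetchall()
```

With this data the totals match the comments in the setup script: user 2 has 3, user 1 has 2, and user 3 has -1. User 4 only cast votes and owns no content, so the inner joins leave that user out.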

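The alternative seems to be to outer join the content tables to the voting table, without the union step. This may be more performant, but I haven't been able to test it because Visual Studio keeps rewriting my SQL for me. I would expect the SQL to look something like this (but I haven't tested it):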

select users.*, sum(voting.vote)
from users, voting, content_a, content_b, content_c
where users.id = content_a.user_id (+)
and users.id = content_b.user_id (+)
and users.id = content_c.user_id (+)
and ((content_a.id = voting.content_id and voting.content_type_id = 1) OR
     (content_b.id = voting.content_id and voting.content_type_id = 2) OR
     (content_c.id = voting.content_id and voting.content_type_id = 3))
group by users.id
order by sum(voting.vote) desc;
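For comparison, one union-free formulation starts from voting and left-joins each content table on its own type id, so that every vote row matches at most one content row. The sketch below does this in ANSI join syntax, again through Python's `sqlite3` with the same sample data. It is a different formulation from the untested query above, not a translation of it: joining all three content tables to users in one step can count a vote more than once when a user owns rows in several content tables, so this version joins from the vote side instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users(id int, name varchar(20));
    create table content_a(id int, user_id int, content varchar(20));
    create table content_b(id int, user_id int, content varchar(20));
    create table content_c(id int, user_id int, content varchar(20));
    create table voting(user_id int, content_id int, content_type_id int, vote int);
    insert into users values (1,'first'), (2,'second'), (3,'third'), (4,'voteonly');
    insert into content_a values (1,1,'aaaa'), (2,1,'bbbb'), (3,1,'cccc');
    insert into content_b values (1,2,'dddd'), (2,2,'eeee'), (3,2,'ffff');
    insert into content_c values (1,1,'gggg'), (2,2,'hhhh'), (3,3,'iiii');
    insert into voting values
        (1,1,1,1), (2,3,1,-1), (3,1,1,1), (4,2,1,1),
        (1,2,2,1), (1,1,2,1), (2,3,2,-1), (4,2,2,1), (4,2,3,1),
        (2,3,3,-1);
""")

# Each vote finds its content row in exactly one of the three left joins;
# coalesce() then picks the owner from whichever join matched.
rows = conn.execute("""
    select u.id, u.name, sum(v.vote) as total
    from voting v
    left join content_a a on v.content_type_id = 1 and a.id = v.content_id
    left join content_b b on v.content_type_id = 2 and b.id = v.content_id
    left join content_c c on v.content_type_id = 3 and c.id = v.content_id
    join users u on u.id = coalesce(a.user_id, b.user_id, c.user_id)
    group by u.id, u.name
    order by total desc
""").fetchall()
```

On the sample data this returns the same ranking as the union version. Whether it is actually faster depends on the database and its indexes, which this sketch does not measure.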

Answer 2 (score: 0)

I would do this with precomputed values. First, make a separate table to store the votes each user has received:

class VotesReceived(models.Model):
    user = models.OneToOneField(User, primary_key=True)
    count = models.IntegerField(default=0, editable=False)

Then use the post_save signal to update the count every time a vote is made:

def update_votes_received(sender, instance, **kwargs):
    # `instance` is a Voting object
    # assuming here that `instance.content.user` is the creator of the content
    vr, _ = VotesReceived.objects.get_or_create(user=instance.content.user)
    # you should recount the votes here rather than just incrementing the count
    vr.count += 1 
    vr.save()

models.signals.post_save.connect(update_votes_received, sender=Voting)

Usage:

user = User.objects.get(id=1)
print(user.votesreceived.count)

If you already have data in your database, you will have to update the vote counts manually the first time.