Using a stored procedure to show rows with Twitter-like scrolling

Time: 2012-02-18 11:26:34

Tags: sql-server database stored-procedures twitter

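I have a website that displays posts. I want the site to scroll the way Twitter does: scrolling down keeps showing more and more posts, indefinitely. Suppose I have the following tables: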

A Post table that holds all posts. Each post is related to one person.

CREATE TABLE [dbo].[Post](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [PersonId] [int] NOT NULL,
    [PublishDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Post] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

A PostTag table that holds all the tags related to each post.

CREATE TABLE [dbo].[PostTag](
    [PostId] [bigint] NOT NULL,
    [TagId] [int] NOT NULL,
 CONSTRAINT [PK_PostTag] PRIMARY KEY CLUSTERED 
(
    [PostId] ASC,
    [TagId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

For each user of the site, the UserPersonStatistics table holds the number of times the user has shown interest in a given person.

CREATE TABLE [dbo].[UserPersonStatistics](
    [UserId] [bigint] NOT NULL,
    [PersonId] [int] NOT NULL,
    [Counter] [bigint] NOT NULL,
 CONSTRAINT [PK_UserPersonStatistics] PRIMARY KEY CLUSTERED 
(
    [UserId] ASC,
    [PersonId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

For each user of the site, the UserPostStatistics table holds the number of times the user has shown interest in a given post.

CREATE TABLE [dbo].[UserPostStatistics](
    [UserId] [bigint] NOT NULL,
    [PostId] [bigint] NOT NULL,
    [Counter] [bigint] NOT NULL,
 CONSTRAINT [PK_UserPostStatistics] PRIMARY KEY CLUSTERED 
(
    [UserId] ASC,
    [PostId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

For each user of the site, the UserTagStatistics table holds the number of times the user has shown interest in posts related to a given tag.

CREATE TABLE [dbo].[UserTagStatistics](
    [UserId] [bigint] NOT NULL,
    [TagId] [int] NOT NULL,
    [Counter] [bigint] NOT NULL,
 CONSTRAINT [PK_UserTagStatistics] PRIMARY KEY CLUSTERED 
(
    [UserId] ASC,
    [TagId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, 
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

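What I need is a stored procedure that returns 35 different posts per user on each call and "remembers" the last 35 posts, so that it does not return the same posts again. The 35 posts should consist of:
15 posts for the most popular tags (UserTagStatistics)
15 posts for the most popular persons (UserPersonStatistics)
5 most popular posts (UserPostStatistics)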

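One problem is that the procedure should return 35 different posts on every call. Another is that the same post can qualify once as a most popular post, once as a post for a most popular tag, and once as a post for a most popular person. Such a post should be counted once, not three times. The performance of the stored procedure is critical.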

I know this is a very complicated question. Any ideas are appreciated.

kruvi

1 answer:

Answer 0 (score: 1)

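Add a "LastViewed" datetime field to all three statistics tables, then use a proc like the one below. For performance, just make sure each of the three tables has an index on UserID + LastViewed + Counter and one on UserID + PersonID (or the matching key column, PostID or TagID, which the primary key already covers), and it should scream. In fact, since UserID + LastViewed + Counter is nearly the whole table, I suggest making that the clustered index on each table if possible. That avoids creating a second index that is essentially the same size as the original table.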
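
As a minimal sketch of the schema change described above, shown for UserPersonStatistics only: the nullable column and the index name IX_UserPersonStatistics_User_LastViewed are assumptions, and this version adds a nonclustered index instead of rebuilding the clustered primary key. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Assumed: LastViewed is nullable, so existing rows start out as "never shown".
ALTER TABLE [dbo].[UserPersonStatistics]
    ADD [LastViewed] [datetime] NULL;
GO

-- Hypothetical index name. Key order follows the answer's suggestion:
-- filter by user, then by batch timestamp, then order by Counter.
CREATE NONCLUSTERED INDEX [IX_UserPersonStatistics_User_LastViewed]
    ON [dbo].[UserPersonStatistics] ([UserId], [LastViewed], [Counter]);
GO
```

The same two statements would be repeated for UserPostStatistics and UserTagStatistics.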

create proc GetInfo(@UserId bigint) as
    begin
        -- One timestamp per call, so every row in this batch carries the same value.
        declare @now datetime
        set @now = getdate()

        -- Stamp the next 15 persons by counter, skipping the batch
        -- that the previous call returned.
        update UserPersonStatistics
        set
            LastViewed = @now
        where
            UserId = @UserId and PersonId in
                (
                select top 15 PersonId from UserPersonStatistics
                where
                    UserId = @UserId and
                    (
                    LastViewed is null or LastViewed !=
                        (select max(LastViewed) from UserPersonStatistics
                         where UserId = @UserId)
                    )
                order by Counter desc
                )

        -- Return the batch that was just stamped.
        select * from UserPersonStatistics
               where UserId = @UserId and LastViewed = @now

        --**Repeat the above code for UserPostStatistics and UserTagStatistics
    end

Revised proc based on feedback:

create proc GetInfo(@UserId bigint) as
    begin
        declare @lastviewed datetime
        -- ItemId holds a PersonId, PostId or TagId, depending on StatType.
        declare @results TABLE
        (
          StatType varchar(10),
          Counter bigint,
          ItemId bigint
        )

        -- Timestamp for this call (not used further in this sketch).
        set @lastviewed = getdate()

        --Person
        insert into @results(StatType,Counter,ItemId)
        select 
            'Person',counter,PersonID
        from
            UserPersonStatistics
        where 
            userid=@UserID and personid in 
                (
                select top 35 personid from userpersonstatistics
                where 
                    userid=@UserID and 
                    (
                    lastviewed is null or lastviewed != 
                        (select max(lastviewed) from userpersonstatistics
                         where userid=@UserID)
                    )    
                order by counter desc
                )


        --Post
        insert into @results(StatType,Counter,ItemId)
        select 
            'Post',counter,PostID
        from
            UserPostStatistics
        where 
            userid=@UserID and Postid in 
                (
                select top 35 Postid from userPoststatistics
                where 
                    userid=@UserID and 
                    (
                    lastviewed is null or lastviewed != 
                        (select max(lastviewed) from userPoststatistics
                         where userid=@UserID)
                    )    
                order by counter desc
                )


        --Tag
        insert into @results(StatType,Counter,ItemId)
        select 
            'Tag',counter,TagID
        from
            UserTagStatistics
        where 
            userid=@UserID and Tagid in 
                (
                select top 35 Tagid from userTagstatistics
                where 
                    userid=@UserID and 
                    (
                    lastviewed is null or lastviewed != 
                        (select max(lastviewed) from userTagstatistics
                         where userid=@UserID)
                    )    
                order by counter desc
                )


        --At this point you could have 105 rows of the various types (35*3).
        --You can use whatever algorithm you need to decide the top 35.
        --That may include some weighting.  
            --You may want to consider using the Rank() function.
    end
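
One way to handle the question's "count a post once, not three times" requirement is sketched below. It assumes the persons and tags have already been resolved to posts (through Post.PersonId and PostTag.TagId), so that each candidate row names a post. The @candidates table variable and its sample rows are made up for illustration, and comparing raw counters across categories is itself an assumption that a real weighting scheme might replace.

```sql
-- Hypothetical candidate list: one row per (post, category).
declare @candidates table
(
    PostId   bigint      not null,
    StatType varchar(10) not null,
    Counter  bigint      not null
)

insert into @candidates (PostId, StatType, Counter)
select 101, 'Post',   9 union all
select 101, 'Tag',    4 union all   -- same post, reached through a tag
select 101, 'Person', 7 union all   -- same post, reached through a person
select 102, 'Tag',    6 union all
select 103, 'Person', 2

-- Collapse to one row per post, keeping its best counter,
-- then take up to 35 posts for the page.
select top 35
    PostId,
    max(Counter) as BestCounter
from @candidates
group by PostId
order by max(Counter) desc
```

Grouping by PostId is what guarantees a post appears only once, whichever categories it came from.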

If your algorithm should consider the #1 top counter of each category before any of the #2s, look at the Rank() function.
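
A small sketch of that idea, using a stand-in table variable with made-up rows (the column name ItemId and the sample values are assumptions). RANK() numbers the rows within each category by counter, so ordering by the rank first lists every category's #1 before any #2.

```sql
-- Stand-in for the rows collected per category; values are illustrative only.
declare @results table
(
    StatType varchar(10) not null,
    Counter  bigint      not null,
    ItemId   bigint      not null
)

insert into @results (StatType, Counter, ItemId)
select 'Person', 12, 1 union all
select 'Person',  5, 2 union all
select 'Post',    8, 101 union all
select 'Post',    3, 102 union all
select 'Tag',    20, 7 union all
select 'Tag',     1, 8

-- Rank within each category, then interleave: all rank-1 rows first,
-- then all rank-2 rows, and so on. Ties share the same rank.
select
    StatType,
    ItemId,
    Counter,
    rank() over (partition by StatType order by Counter desc) as CategoryRank
from @results
order by CategoryRank, StatType
```

If ties should not share a rank, ROW_NUMBER() can be used in place of RANK() with the same OVER clause.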