Question

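I have a website that displays posts. I want it to scroll the way Twitter does: scrolling down keeps loading more posts indefinitely. Suppose I have the following tables: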

A Post table that holds all the posts. Each post is related to one person.

CREATE TABLE [dbo].[Post](
    [Id] [bigint] IDENTITY(1,1) NOT NULL,
    [PersonId] [int] NOT NULL,
    [PublishDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Post] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

A PostTag table that holds all the tags related to each post.

CREATE TABLE [dbo].[PostTag](
    [PostId] [bigint] NOT NULL,
    [TagId] [int] NOT NULL,
 CONSTRAINT [PK_PostTag] PRIMARY KEY CLUSTERED 
(
    [PostId] ASC,
    [TagId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

For each user of the site, the UserPersonStatistics table holds the number of times he has shown interest in a given person.

CREATE TABLE [dbo].[UserPersonStatistics](
    [UserId] [bigint] NOT NULL,
    [PersonId] [int] NOT NULL,
    [Counter] [bigint] NOT NULL,
 CONSTRAINT [PK_UserPersonStatistics] PRIMARY KEY CLUSTERED 
(
    [UserId] ASC,
    [PersonId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

For each user of the site, the UserPostStatistics table holds the number of times he has shown interest in a given post.

CREATE TABLE [dbo].[UserPostStatistics](
    [UserId] [bigint] NOT NULL,
    [PostId] [bigint] NOT NULL,
    [Counter] [bigint] NOT NULL,
 CONSTRAINT [PK_UserPostStatistics] PRIMARY KEY CLUSTERED 
(
    [UserId] ASC,
    [PostId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF,
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

For each user of the site, the UserTagStatistics table holds the number of times he has shown interest in posts related to a given tag.

CREATE TABLE [dbo].[UserTagStatistics](
    [UserId] [bigint] NOT NULL,
    [TagId] [int] NOT NULL,
    [Counter] [bigint] NOT NULL,
 CONSTRAINT [PK_UserTagStatistics] PRIMARY KEY CLUSTERED 
(
    [UserId] ASC,
    [TagId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, 
       ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

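What I need is a stored procedure that returns 35 different posts per user on each call and "remembers" the last 35 posts so that it does not return the same posts again. The 35 posts should include 15 posts for the most popular tags (UserTagStatistics), 15 posts for the most popular persons (UserPersonStatistics), and the 5 most popular posts (UserPostStatistics).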

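One problem is that the procedure should return 35 different posts each time. Another problem is that the same post can be returned once as a most popular post, once as a post for a most popular tag, and once as a post for a most popular person. Such a post should be counted once, not three times. The performance of the stored procedure is critical.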
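To make the "counted once" requirement concrete, here is a minimal sketch of one way to collapse duplicates. The table variable @candidates, its columns, and the sample values are hypothetical and only serve the illustration; the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical candidate list: the same post can arrive from several sources.
DECLARE @candidates TABLE
(
    PostId bigint      NOT NULL,
    Source varchar(10) NOT NULL,  -- 'Tag', 'Person' or 'Post'
    Score  bigint      NOT NULL   -- the Counter that made the post a candidate
);

INSERT INTO @candidates (PostId, Source, Score)
VALUES (101, 'Tag', 40), (101, 'Person', 25), (101, 'Post', 7), (102, 'Tag', 12);

-- Number the rows of each post, best score first, and keep only the first one.
WITH ranked AS
(
    SELECT PostId, Source, Score,
           ROW_NUMBER() OVER (PARTITION BY PostId ORDER BY Score DESC, Source) AS rn
    FROM @candidates
)
SELECT PostId, Source, Score
FROM ranked
WHERE rn = 1;
```

With this sample data, post 101 appears three times in the candidate list but only once in the result.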

I know this is a very complex problem. Any ideas are appreciated.

kruvi

Answer 1

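Add a "LastViewed" datetime column to all of the statistics tables, then use a proc like the one below. For performance, just make sure each of the three tables has an index on UserID + LastViewed + Counter and one on UserID + PersonID (or the matching key column for that table), and it should scream. In fact, since UserID + LastViewed + Counter is almost the entire table, I would suggest making it the clustered index on each table if possible, so that you avoid creating a second index that is essentially the same size as the original table.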
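As a rough sketch of that setup for one of the three tables (the index name is made up for illustration, and a nonclustered index is shown because the question's schema already has a clustered primary key):

```sql
-- Add the nullable timestamp; existing rows start out as "never viewed".
ALTER TABLE dbo.UserPersonStatistics ADD LastViewed datetime NULL;
GO

-- Supporting index on UserId + LastViewed + Counter (hypothetical name).
CREATE NONCLUSTERED INDEX IX_UserPersonStatistics_User_LastViewed_Counter
    ON dbo.UserPersonStatistics (UserId, LastViewed, Counter DESC);
GO
```

GO is the batch separator used by SSMS and sqlcmd, not a T-SQL statement. The UserId + PersonId index already exists in the question's schema as the clustered primary key, so making UserId + LastViewed + Counter the clustered index instead would mean recreating that primary key as nonclustered. The same two statements would be repeated for UserPostStatistics and UserTagStatistics.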

    create proc GetInfo(@UserId bigint) as
    begin
        -- Stamp the next 15 persons (highest counter first) that were not
        -- part of the most recently returned batch.
        update userpersonstatistics 
        set 
            lastviewed=getdate() 
        where 
            userid=@UserID and personid in 
                (
                select top 15 personid from userpersonstatistics
                where 
                    userid=@UserID and 
                    (
                    lastviewed is null or lastviewed != 
                        (select max(lastviewed) from userpersonstatistics
                         where userid=@UserID)
                    )    
                order by counter desc
                )

        -- Return the rows that were just stamped.
        select * from UserPersonStatistics 
               where UserID=@UserID and LastViewed  = 
            (select max(lastviewed) from UserPersonStatistics
             where UserID=@UserID)

        --**Repeat the above code for UserPostStatistics and UserTagStatistics
    end

Revised proc based on input:

    create proc GetInfo(@UserId bigint) as
    begin
        declare @lastviewed datetime
        declare @results TABLE
        (
          StatType varchar(10),
          Counter bigint,
          ItemID bigint   -- PersonId, PostId or TagId, depending on StatType
        )

        -- one timestamp for the whole batch
        set @lastviewed = getdate()

        --Person
        insert into @results(StatType,Counter,ItemID)
        select 
            'Person',counter,PersonID
        from
            UserPersonStatistics
        where 
            userid=@UserID and personid in 
                (
                select top 35 personid from userpersonstatistics
                where 
                    userid=@UserID and 
                    (
                    lastviewed is null or lastviewed != 
                        (select max(lastviewed) from userpersonstatistics
                         where userid=@UserID)
                    )    
                order by counter desc
                )


        --Post
        insert into @results(StatType,Counter,ItemID)
        select 
            'Post',counter,PostID
        from
            UserPostStatistics
        where 
            userid=@UserID and Postid in 
                (
                select top 35 Postid from userPoststatistics
                where 
                    userid=@UserID and 
                    (
                    lastviewed is null or lastviewed != 
                        (select max(lastviewed) from userPoststatistics
                         where userid=@UserID)
                    )    
                order by counter desc
                )


        --Tag
        insert into @results(StatType,Counter,ItemID)
        select 
            'Tag',counter,TagID
        from
            UserTagStatistics
        where 
            userid=@UserID and Tagid in 
                (
                select top 35 Tagid from userTagstatistics
                where 
                    userid=@UserID and 
                    (
                    lastviewed is null or lastviewed != 
                        (select max(lastviewed) from userTagstatistics
                         where userid=@UserID)
                    )    
                order by counter desc
                )


        --At this point you could have 105 rows of the various types (35*3).
        --You can use whatever algorithm you need to decide the top 35.
        --That may include some weighting.  
        --You may want to consider using the Rank() function.
    end
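The candidate rows come from the statistics tables, which are keyed by person, post, or tag rather than always by post. One possible way to turn person and tag candidates into posts, assuming only the tables from the question, is to join through Post and PostTag. The @UserId value and the tie-breaking by PublishDate are assumptions for this sketch, not something the question specifies.

```sql
DECLARE @UserId bigint = 1;  -- hypothetical user

-- Posts written by the persons this user is most interested in.
SELECT TOP (15) p.Id AS PostId, ups.Counter
FROM dbo.UserPersonStatistics AS ups
JOIN dbo.Post AS p ON p.PersonId = ups.PersonId
WHERE ups.UserId = @UserId
ORDER BY ups.Counter DESC, p.PublishDate DESC;

-- Posts carrying the tags this user is most interested in.
-- A post with several tags is grouped so that it shows up once,
-- scored by its strongest tag.
SELECT TOP (15) pt.PostId, MAX(uts.Counter) AS Counter
FROM dbo.UserTagStatistics AS uts
JOIN dbo.PostTag AS pt ON pt.TagId = uts.TagId
WHERE uts.UserId = @UserId
GROUP BY pt.PostId
ORDER BY MAX(uts.Counter) DESC;
```

UserPostStatistics already stores PostId directly, so it needs no join.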

If your algorithm should consider the #1 top counter of every category before any of the #2s, take a look at the Rank() function.
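A minimal sketch of that idea follows. The table variable, the ItemId column name, and the sample rows are placeholders for whatever the proc has collected.

```sql
-- Hypothetical candidates: one row per statistic entry.
DECLARE @results TABLE
(
    StatType varchar(10) NOT NULL,
    Counter  bigint      NOT NULL,
    ItemId   bigint      NOT NULL
);

INSERT INTO @results (StatType, Counter, ItemId)
VALUES ('Person', 90, 1), ('Person', 50, 2),
       ('Post',   30, 11), ('Post',  10, 12),
       ('Tag',    70, 21), ('Tag',   70, 22);

-- Rank inside each category, then take every category's #1 before any #2.
SELECT TOP (35)
       StatType, ItemId, Counter,
       RANK() OVER (PARTITION BY StatType ORDER BY Counter DESC) AS RankInType
FROM @results
ORDER BY RankInType, Counter DESC;
```

RANK() gives tied rows the same rank, so the two 'Tag' rows above both count as #1. If exactly one row per position is wanted, ROW_NUMBER() with an extra tie-breaker in the ORDER BY is the usual substitute.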

Displaying rows when scrolling like Twitter, using a stored procedure
