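Best SQL 2005 query to get related items?

Time: 2009-06-27 16:49:21

Tags: linq sql-server-2005 tsql

I have a small video site, and I want to fetch related videos based on the best-matching tags. What is the best MSSQL 2005 query for getting related videos?

A LINQ query would also be appreciated.


Schema: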

CREATE TABLE Videos
    (VideoID bigint not null , 
    Title varchar(100) NULL, 
    isActive bit NULL  )

CREATE TABLE Tags
    (TagID bigint not null , 
    Tag varchar(100) NULL )

CREATE TABLE VideoTags
    (VideoID bigint not null , 
    TagID bigint not null )

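Each video can have multiple tags. Now I want to get related videos based on their tags, but only the videos that match the most tags. Videos with the most matches should come first, videos with fewer matches should be at the bottom, and if no tags match, no videos should be returned.

Also, I would like to know whether the schema above is OK if I have, say, more than a million videos and 10-20 tags per video.

4 answers:

Answer 0: (score: 1)

Here is the SQL.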

SELECT v.VideoID, v.Title, v.isActive
FROM Videos v
  JOIN 
(
  SELECT vt.VideoID, Count(*) as MatchCount
  FROM VideoTags vt
  WHERE vt.TagID in
  (
    SELECT TagID
    FROM Tags t
    WHERE t.Tag in ('horror', 'scifi')
  )
  GROUP BY vt.VideoID
) as sub
  ON v.VideoID = sub.VideoID
ORDER BY sub.MatchCount desc

Here is the LINQ.

List<string> TagList = new List<string>() {"horror", "scifi"};

  //find tag ids.
var tagQuery =
  from t in db.Tags
  where TagList.Contains(t.Tag)
  select t.TagID;

  //find matching video ids, count matches for each
var videoTagQuery =
  from vt in db.VideoTags
  where tagQuery.Contains(vt.TagID)
  group vt by vt.VideoID into g
  select new { VideoID = g.Key, matchCount = g.Count() };

  //fetch videos where matches were found
  //ordered by the number of matches, most matches first
var videoQuery =
  from v in db.Videos
  join x in videoTagQuery on v.VideoID equals x.VideoID
  orderby x.matchCount descending
  select v;

  //hit the database and pull back the results
List<Video> result = videoQuery.ToList();

Oh wait, you don't have a list of tags. You have a video and want the videos with similar tags. OK:

SELECT v.VideoID, v.Title, v.isActive
FROM Videos v
  JOIN 
(
  SELECT vt.VideoID, Count(*) as MatchCount
  FROM VideoTags vt
  WHERE vt.TagID in
  (
    SELECT TagID
    FROM VideoTags vt2
    WHERE vt2.VideoID = @VideoID
  )
  GROUP BY vt.VideoID
) as sub
  ON v.VideoID = sub.VideoID
ORDER BY sub.MatchCount desc

The LINQ is the same, except that the tag query changes:

int myVideoID = 4;

  //find the tag ids of the given video.
var tagQuery =
  from t in db.VideoTags
  where t.VideoID == myVideoID
  select t.TagID;
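
For anyone who wants to try the "related to one video" query without a SQL Server instance, here is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in for MSSQL 2005. The sample rows and the helper names `build_sample_db` and `related_videos` are made up for illustration. The extra `vt.VideoID <> ?` filter, which keeps the source video out of its own results, is an assumption about what the asker wants and is not part of the query above.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with the question's three tables (hypothetical rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Videos (VideoID INTEGER NOT NULL, Title TEXT, isActive INTEGER);
        CREATE TABLE Tags (TagID INTEGER NOT NULL, Tag TEXT);
        CREATE TABLE VideoTags (VideoID INTEGER NOT NULL, TagID INTEGER NOT NULL);
        INSERT INTO Videos VALUES (1,'Alien',1),(2,'Aliens',1),(3,'Scream',1),(4,'Up',1);
        INSERT INTO Tags VALUES (1,'horror'),(2,'scifi'),(3,'animation');
        INSERT INTO VideoTags VALUES (1,1),(1,2),(2,1),(2,2),(3,1),(4,3);
    """)
    return conn


def related_videos(conn, video_id):
    """Return (VideoID, Title, MatchCount) rows, most shared tags first."""
    sql = """
        SELECT v.VideoID, v.Title, sub.MatchCount
        FROM Videos v
        JOIN (
            SELECT vt.VideoID, COUNT(*) AS MatchCount
            FROM VideoTags vt
            WHERE vt.TagID IN (
                SELECT vt2.TagID FROM VideoTags vt2 WHERE vt2.VideoID = ?
            )
              AND vt.VideoID <> ?      -- assumption: skip the source video itself
            GROUP BY vt.VideoID
        ) AS sub ON v.VideoID = sub.VideoID
        ORDER BY sub.MatchCount DESC, v.VideoID
    """
    return conn.execute(sql, (video_id, video_id)).fetchall()


conn = build_sample_db()
rows = related_videos(conn, 1)   # video 1 is tagged horror + scifi
print(rows)
```

Videos with no tag in common never appear in the grouped subquery, so the inner join drops them, which matches the "return nothing when nothing matches" requirement.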

Answer 1: (score: 0)

Is something like this what you are after?

String horror = "Horror";
String thriller = "Thriller";

// property names follow the column names in the question's schema (VideoID, TagID)
var results =
    from v in db.Videos
    join vt in db.VideoTags on v.VideoID equals vt.VideoID
    join t in db.Tags on vt.TagID equals t.TagID
    where
        t.Tag == horror || t.Tag == thriller
    select v;

Answer 2: (score: 0)

This query will get the videos ordered by the number of matching tags (in descending order):

select v.VideoID, v.Title, count(*) as nroOfTags
from Videos v
join VideoTags vt on vt.VideoID = v.VideoID
join Tags t on t.TagID = vt.TagID   -- TagID is a bigint, so filter on the tag name
where t.Tag in ('horror','action','adventure')
group by v.VideoID, v.Title
order by count(*) desc

As for the data model, it is fine. Assuming all the appropriate indexes are in place, it will work well.
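
The answer does not say which indexes count as "appropriate". One plausible reading, sketched below, is a composite key on (VideoID, TagID) plus a second index on (TagID, VideoID) so that lookups by tag do not have to scan the whole link table. This is an assumption, the index name `IX_VideoTags_TagID_VideoID` is invented, and the sketch uses Python's sqlite3 module as a stand-in, so the plan SQL Server 2005 picks may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VideoTags (
        VideoID INTEGER NOT NULL,
        TagID   INTEGER NOT NULL,
        PRIMARY KEY (VideoID, TagID)          -- serves "tags of one video"
    );
    -- hypothetical reverse index: serves "videos having one tag"
    CREATE INDEX IX_VideoTags_TagID_VideoID ON VideoTags (TagID, VideoID);
""")

# List the indexes SQLite now knows about for the link table.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('VideoTags')")]
print(index_names)

# Ask SQLite how it would run a lookup by tag; the last column describes each step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT VideoID FROM VideoTags WHERE TagID = ?", (1,)
).fetchall()
for step in plan:
    print(step[3])
```

On SQL Server the equivalent check would be to look at the actual execution plan for the related-videos query once realistic data is loaded.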

Answer 3: (score: 0)

I made a few changes to the DDL:

CREATE TABLE [Tags](
    [TagID] [bigint] IDENTITY(1,1) NOT NULL,
    [Tag] [nvarchar](100) NOT NULL,
PRIMARY KEY CLUSTERED 
(
    [TagID] ASC
),
 CONSTRAINT [UC_Tags] UNIQUE NONCLUSTERED 
(
    [Tag] ASC
)
)

GO

CREATE TABLE [Videos](
    [VideoID] [bigint] IDENTITY(1,1) NOT NULL,
    [Title] [nvarchar](100) NOT NULL,
    [isActive] [bit] NOT NULL,
PRIMARY KEY CLUSTERED 
(
    [VideoID] ASC
),
 CONSTRAINT [UC_Videos] UNIQUE NONCLUSTERED 
(
    [Title] ASC
)
)

GO

CREATE TABLE [VideoTags](
    [VideoID] [bigint] NOT NULL,
    [TagID] [bigint] NOT NULL,
PRIMARY KEY CLUSTERED 
(
    [VideoID] ASC,
    [TagID] ASC
)
)

GO

ALTER TABLE [VideoTags]  WITH CHECK ADD FOREIGN KEY([TagID])
REFERENCES [Tags] ([TagID])
GO

ALTER TABLE [VideoTags]  WITH CHECK ADD FOREIGN KEY([VideoID])
REFERENCES [Videos] ([VideoID])
GO
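  1. I made the text columns nvarchar. That makes it easier to keep track of "foreign" films.
  2. I made the ID columns IDENTITY columns and primary keys.
  3. I specified foreign keys.
  4. I made the Tag and Title columns unique. You don't want duplicate titles or tags.
  5. I made all of these columns non-nullable. It makes no sense to have a video or tag with an unknown name, and a video is either active or inactive, never "maybe" or "unknown".
  6. I added a primary key to VideoTags to prevent duplicates.
  7. For the SQL query, I would try the following. Without test data, I can't be sure it is right: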

    ;
    WITH VIDEO_TAG_COUNTS(VideoID,TagCount)
    AS
    (
        -- count only the tags each video shares with the source video @VideoID
        SELECT VT.VideoID, COUNT(*)
        FROM VideoTags VT
        INNER JOIN VideoTags SRC ON SRC.TagID = VT.TagID
        WHERE SRC.VideoID = @VideoID
          AND VT.VideoID <> @VideoID
        GROUP BY VT.VideoID
    )
    SELECT V.VideoID, V.Title
    FROM Videos V 
    INNER JOIN VIDEO_TAG_COUNTS VTC ON V.VideoID = VTC.VideoID
    WHERE V.isActive = 1
    ORDER BY VTC.TagCount DESC
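
Points 3 and 6 in the list above are easy to check in isolation. The following is a rough sketch only: it uses Python's sqlite3 module rather than SQL Server, the rows are invented, and `try_insert` is a hypothetical helper. It shows the composite primary key rejecting a duplicate video/tag pair and the foreign key rejecting a tag that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE Tags (TagID INTEGER PRIMARY KEY, Tag TEXT NOT NULL UNIQUE);
    CREATE TABLE Videos (VideoID INTEGER PRIMARY KEY, Title TEXT NOT NULL UNIQUE,
                         isActive INTEGER NOT NULL);
    CREATE TABLE VideoTags (
        VideoID INTEGER NOT NULL REFERENCES Videos (VideoID),
        TagID   INTEGER NOT NULL REFERENCES Tags (TagID),
        PRIMARY KEY (VideoID, TagID)
    );
    INSERT INTO Tags VALUES (1, 'horror');
    INSERT INTO Videos VALUES (1, 'Alien', 1);
    INSERT INTO VideoTags VALUES (1, 1);
""")


def try_insert(video_id, tag_id):
    """Attempt to link a video to a tag; return 'ok' or the constraint error text."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO VideoTags VALUES (?, ?)", (video_id, tag_id))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


duplicate = try_insert(1, 1)   # same pair again: blocked by the primary key
orphan = try_insert(1, 99)     # tag 99 does not exist: blocked by the foreign key
print(duplicate)
print(orphan)
```

Having the duplicates blocked at the table level also means the COUNT(*) in the related-videos queries cannot be inflated by a pair that was inserted twice.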