RavenDB: Indexing blog post tags from multiple blogs

Date: 2012-04-19 12:46:47

Tags: linq mapreduce ravendb indexing

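How do I create an appropriate AbstractIndexCreationTask for the following scenario?

Given multiple blogs, how can I get the tags used in specific blogs, together with the count of each tag?

The members of interest in the data structure stored in RavenDB: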

public class BlogPost {
    public string BlogKey { get; set; }
    public IEnumerable<string> Tags { get; set; }
    /* ... */
}

The method I need to implement has the following signature:

public Dictionary<string, int> GetTagsByBlogs(string tag, params string[] blogKeys)

In plain LINQ I would write something like this:

var tags = from post in blogPosts
           from tag in post.Tags
           where blogKeys.Contains(post.BlogKey)
           group tag by tag into g
           select new {
               Tag = g.Key,
               Count = g.Count(),
           };

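But RavenDB supports neither SelectMany nor GroupBy in queries. I have tried different combinations for a map-reduce solution, but I can't figure out how to do this, because the map and the reduce differ in their data structure.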
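For reference, the intended result can be reproduced in memory with LINQ to Objects. The sketch below runs the same kind of query over a small list of posts; the sample data and the Program class are assumptions made for illustration and are not part of the original question. It only shows the target output shape, not how RavenDB would compute it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class BlogPost
{
    public string BlogKey { get; set; }
    public IEnumerable<string> Tags { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical sample data, for illustration only.
        var blogPosts = new List<BlogPost>
        {
            new BlogPost { BlogKey = "a", Tags = new[] { "linq", "ravendb" } },
            new BlogPost { BlogKey = "b", Tags = new[] { "ravendb" } },
            new BlogPost { BlogKey = "c", Tags = new[] { "linq" } },
        };
        var blogKeys = new[] { "a", "b" };

        // Filter by blog first, then flatten the tags and count each one.
        var tags = from post in blogPosts
                   where blogKeys.Contains(post.BlogKey)
                   from tag in post.Tags
                   group tag by tag into g
                   select new { Tag = g.Key, Count = g.Count() };

        foreach (var t in tags)
            Console.WriteLine("{0}: {1}", t.Tag, t.Count);
    }
}
```

With this sample data, blog "c" is filtered out, so "ravendb" is counted twice and "linq" once.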

2 Answers:

Answer 0: (score: 3)

How to create a tag cloud is described in the RavenDB knowledge base.

In your case, you have to include the BlogKey in the index, in particular in the group by clause:

public class Tags_Count : AbstractIndexCreationTask<BlogPost, Tags_Count.ReduceResult>
{
    public class ReduceResult
    {
        public string BlogKey { get; set; }
        public string Name { get; set; }
        public int Count { get; set; }
    }

    public Tags_Count()
    {
        Map = posts => from post in posts
                       from tag in post.Tags
                       select new { 
                           BlogKey = post.BlogKey,
                           Name = tag.ToString().ToLower(), 
                           Count = 1 
                       };
        Reduce = results => from tagCount in results
                            group tagCount by new { 
                                tagCount.BlogKey,  
                                tagCount.Name } into g
                            select new {
                                BlogKey = g.Key.BlogKey,
                                Name = g.Key.Name, 
                                Count = g.Sum(x => x.Count) 
                            };

        Sort(result => result.Count, SortOptions.Int); 
    }
}

Then query that index with the desired BlogKey:

var result = session.Query<Tags_Count.ReduceResult, Tags_Count>()
    .Where(x => x.BlogKey == myBlogKey)
    .OrderByDescending(x => x.Count)
    .ToArray();

If you need to query for multiple blogs, you can try this query:

var tagsByBlogs = session.Query<Tags_Count.ReduceResult, Tags_Count>()
    .Where(x => x.BlogKey.In<string>(blogKeys))
    .OrderByDescending(x => x.Count)
    .ToArray();

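AFAIK that is as far as you can get with an index. You still have to aggregate the results on the client side, as in your original question, except that you can skip the filtering on blogKeys. Because each index result already holds a per-blog count, sum those counts rather than counting rows:

var tags = from tag in tagsByBlogs
           group tag by tag.Name into g
           select new {
               Tag = g.Key,
               Count = g.Sum(x => x.Count),
           };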
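As a rough sketch of that client-side step, the rows returned by the index can be folded into the Dictionary<string, int> that the question's method signature asks for. Each row carries a per-blog count, so rows for the same tag are combined by summing Count. The SumByTag helper, the TagAggregation class, and the sample rows are hypothetical names and data introduced here; the ReduceResult class is a stand-in that mirrors Tags_Count.ReduceResult so the example runs without RavenDB.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in that mirrors Tags_Count.ReduceResult from the index definition.
public class ReduceResult
{
    public string BlogKey { get; set; }
    public string Name { get; set; }
    public int Count { get; set; }
}

public static class TagAggregation
{
    // Hypothetical helper: merges per-blog rows into one total per tag.
    public static Dictionary<string, int> SumByTag(IEnumerable<ReduceResult> rows)
    {
        return rows
            .GroupBy(r => r.Name)
            .ToDictionary(g => g.Key, g => g.Sum(r => r.Count));
    }

    public static void Main()
    {
        // Stand-in for the array returned by the multi-blog index query.
        var tagsByBlogs = new[]
        {
            new ReduceResult { BlogKey = "a", Name = "ravendb", Count = 3 },
            new ReduceResult { BlogKey = "b", Name = "ravendb", Count = 2 },
            new ReduceResult { BlogKey = "a", Name = "linq", Count = 1 },
        };

        foreach (var pair in SumByTag(tagsByBlogs))
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
    }
}
```

In a real GetTagsByBlogs implementation, the array passed to SumByTag would presumably come from the index query rather than from hard-coded rows.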

Answer 1: (score: 1)

Take a look at faceted search; it lets you specify the criteria at query time, like this:

var facetResults = s.Query<BlogPost>("BlogIndex") 
                        .Where(x => x.BlogKey == "1" || x.BlogKey == "5" ...) 
                        .ToFacets("facets/BlogFacets");

This then groups (and counts) all results that match the where clause.

Update: you'll need an index that looks something like this:

from post in blogPosts
from tag in post.Tags 
select new 
{
    post.BlogKey,
    Tag = tag     
}