RavenDB: How can I properly index a cartesian product in a map-reduce?

Time: 2017-01-12 13:06:03

Tags: ravendb cartesian-product cross-join

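This question is a spin-off of "RavenDB: Why do I get null-values for fields in this multi-map/reduce index?", but I realized that the actual problem was a different one.

Consider my extremely simplified domain, rewritten as a movie rental store scenario for abstraction: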

public class User
{
    public string Id { get; set; }
}

public class Movie
{
    public string Id { get; set; }
}

public class MovieRental
{
    public string Id { get; set; }
    public string MovieId { get; set; }
    public string UserId { get; set; }
}

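It's a textbook many-to-many example.

The index I want to create is this:

For a given user, give me a list of every movie in the database (filtering/search left out for the moment), along with an integer describing how many times (or zero) the user has rented this movie.

Basically like this: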

Users:

| Id     |
|--------|
| John   |
| Lizzie |
| Albert |

Movies:

| Id           |
|--------------|
| Robocop      |
| Notting Hill |
| Inception    |

MovieRentals:

| Id        | UserId | MovieId      |
|-----------|--------|--------------|
| rental-00 | John   | Robocop      |
| rental-01 | John   | Notting Hill |
| rental-02 | John   | Notting Hill |
| rental-03 | Lizzie | Robocop      |
| rental-04 | Lizzie | Robocop      |
| rental-05 | Lizzie | Inception    |

Ideally, I want an index to query that would look like this:

| UserId | MovieId      | RentalCount |
|--------|--------------|-------------|
| John   | Robocop      | 1           |
| John   | Notting Hill | 2           |
| John   | Inception    | 0           |
| Lizzie | Robocop      | 2           |
| Lizzie | Notting Hill | 0           |
| Lizzie | Inception    | 1           |
| Albert | Robocop      | 0           |
| Albert | Notting Hill | 0           |
| Albert | Inception    | 0           |

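Or declaratively:

  • I always want a full list of all movies (eventually I will add filtering/searching) - even when providing a user who has never rented a single movie
  • I want a count of the rentals for each user, just the integer
  • I want to be able to sort by the rental count - i.e. show the most-rented movies for a given user at the top of the list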

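However, I can't find a way to make the "cross join" above and save it in the index. Instead, I initially thought I had it right with the approach below, but it does not allow me to sort (see the failing test):

{"Not supported computation: x.UserRentalCounts.SingleOrDefault(rentalCount => (rentalCount.UserId == value(UnitTestProject2.MovieRentalTests+<>c__DisplayClass0_0).user_john.Id)).Count. You cannot use computation in RavenDB queries (only simple member expressions are allowed)."}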

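My question is basically: how can I - or can I at all - build an index that fulfills my requirements?

Below is the example I mentioned. It does not fulfill my requirements, but that's where I am right now. It uses the following packages (VS2015):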

packages.config

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <package id="Microsoft.Owin.Host.HttpListener" version="3.0.1" targetFramework="net461" />
  <package id="NUnit" version="3.5.0" targetFramework="net461" />
  <package id="RavenDB.Client" version="3.5.2" targetFramework="net461" />
  <package id="RavenDB.Database" version="3.5.2" targetFramework="net461" />
  <package id="RavenDB.Tests.Helpers" version="3.5.2" targetFramework="net461" />
</packages>

MovieRentalTests.cs

using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;
using Raven.Client.Indexes;
using Raven.Client.Linq;
using Raven.Tests.Helpers;

namespace UnitTestProject2
{
    [TestFixture]
    public class MovieRentalTests : RavenTestBase
    {
        [Test]
        public void DoSomeTests()
        {
            using (var server = GetNewServer())
            using (var store = NewRemoteDocumentStore(ravenDbServer: server))
            {
                //Test-data
                var user_john = new User { Id = "John" };
                var user_lizzie = new User { Id = "Lizzie" };
                var user_albert = new User { Id = "Albert" };


                var movie_robocop = new Movie { Id = "Robocop" };
                var movie_nottingHill = new Movie { Id = "Notting Hill" };
                var movie_inception = new Movie { Id = "Inception" };

                var rentals = new List<MovieRental>
                {
                    new MovieRental {Id = "rental-00", UserId = user_john.Id, MovieId = movie_robocop.Id},
                    new MovieRental {Id = "rental-01", UserId = user_john.Id, MovieId = movie_nottingHill.Id},
                    new MovieRental {Id = "rental-02", UserId = user_john.Id, MovieId = movie_nottingHill.Id},
                    new MovieRental {Id = "rental-03", UserId = user_lizzie.Id, MovieId = movie_robocop.Id},
                    new MovieRental {Id = "rental-04", UserId = user_lizzie.Id, MovieId = movie_robocop.Id},
                    new MovieRental {Id = "rental-05", UserId = user_lizzie.Id, MovieId = movie_inception.Id}
                };

                //Init index
                new Movies_WithRentalsByUsersCount().Execute(store);

                //Insert test-data in db
                using (var session = store.OpenSession())
                {
                    session.Store(user_john);
                    session.Store(user_lizzie);
                    session.Store(user_albert);

                    session.Store(movie_robocop);
                    session.Store(movie_nottingHill);
                    session.Store(movie_inception);

                    foreach (var rental in rentals)
                    {
                        session.Store(rental);
                    }

                    session.SaveChanges();

                    WaitForAllRequestsToComplete(server);
                    WaitForIndexing(store);
                }

                //Test of correct rental-counts for users
                using (var session = store.OpenSession())
                {
                    var allMoviesWithRentalCounts =
                        session.Query<Movies_WithRentalsByUsersCount.ReducedResult, Movies_WithRentalsByUsersCount>()
                            .ToList();

                    var robocopWithRentalsCounts = allMoviesWithRentalCounts.Single(m => m.MovieId == movie_robocop.Id);
                    Assert.AreEqual(1, robocopWithRentalsCounts.UserRentalCounts.FirstOrDefault(x => x.UserId == user_john.Id)?.Count ?? 0);
                    Assert.AreEqual(2, robocopWithRentalsCounts.UserRentalCounts.FirstOrDefault(x => x.UserId == user_lizzie.Id)?.Count ?? 0);
                    Assert.AreEqual(0, robocopWithRentalsCounts.UserRentalCounts.FirstOrDefault(x => x.UserId == user_albert.Id)?.Count ?? 0);

                    var nottingHillWithRentalsCounts = allMoviesWithRentalCounts.Single(m => m.MovieId == movie_nottingHill.Id);
                    Assert.AreEqual(2, nottingHillWithRentalsCounts.UserRentalCounts.FirstOrDefault(x => x.UserId == user_john.Id)?.Count ?? 0);
                    Assert.AreEqual(0, nottingHillWithRentalsCounts.UserRentalCounts.FirstOrDefault(x => x.UserId == user_lizzie.Id)?.Count ?? 0);
                    Assert.AreEqual(0, nottingHillWithRentalsCounts.UserRentalCounts.FirstOrDefault(x => x.UserId == user_albert.Id)?.Count ?? 0);
                }

                // Test that you for a given user can sort the movies by view-count
                using (var session = store.OpenSession())
                {
                    var allMoviesWithRentalCounts =
                        session.Query<Movies_WithRentalsByUsersCount.ReducedResult, Movies_WithRentalsByUsersCount>()
                            .OrderByDescending(x => x.UserRentalCounts.SingleOrDefault(rentalCount => rentalCount.UserId == user_john.Id).Count)
                            .ToList();

                    Assert.AreEqual(movie_nottingHill.Id, allMoviesWithRentalCounts[0].MovieId);
                    Assert.AreEqual(movie_robocop.Id, allMoviesWithRentalCounts[1].MovieId);
                    Assert.AreEqual(movie_inception.Id, allMoviesWithRentalCounts[2].MovieId);
                }
            }
        }

        public class Movies_WithRentalsByUsersCount :
            AbstractMultiMapIndexCreationTask<Movies_WithRentalsByUsersCount.ReducedResult>
        {
            public Movies_WithRentalsByUsersCount()
            {
                AddMap<MovieRental>(rentals =>
                    from r in rentals
                    select new ReducedResult
                    {
                        MovieId = r.MovieId,
                        UserRentalCounts = new[] { new UserRentalCount { UserId = r.UserId, Count = 1 } }
                    });

                AddMap<Movie>(movies =>
                    from m in movies
                    select new ReducedResult
                    {
                        MovieId = m.Id,
                        UserRentalCounts = new[] { new UserRentalCount { UserId = null, Count = 0 } }
                    });

                Reduce = results =>
                    from result in results
                    group result by result.MovieId
                    into g
                    select new
                    {
                        MovieId = g.Key,
                        UserRentalCounts = (
                                from userRentalCount in g.SelectMany(x => x.UserRentalCounts)
                                group userRentalCount by userRentalCount.UserId
                                into subGroup
                                select new UserRentalCount { UserId = subGroup.Key, Count = subGroup.Sum(b => b.Count) })
                            .ToArray()
                    };
            }

            public class ReducedResult
            {
                public string MovieId { get; set; }
                public UserRentalCount[] UserRentalCounts { get; set; }
            }

            public class UserRentalCount
            {
                public string UserId { get; set; }
                public int Count { get; set; }
            }
        }

        public class User
        {
            public string Id { get; set; }
        }

        public class Movie
        {
            public string Id { get; set; }
        }

        public class MovieRental
        {
            public string Id { get; set; }
            public string MovieId { get; set; }
            public string UserId { get; set; }
        }
    }
}

1 answer:

Answer 0 (score: 1)

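Since your requirement says "for a given user": if you really are looking at only a single user, you can do this with a multi-map index. Use the Movies table itself to produce the baseline zero-count records, and then map in the user's actual MovieRentals records on top of that.

If you really need all users crossed with all movies, I don't believe there is a way to do this cleanly with RavenDB, because this would be considered reporting, which is noted as one of the sour spots for RavenDB.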

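If you really want to try to do this with RavenDB, here are some options:

1) Create dummy records in the DB for every user and every movie, and use those in the index with a count of 0. Whenever a movie or user is added/updated/deleted, update the dummy records accordingly.

2) Generate the zero-count records yourself in memory on request, and merge that data with the data RavenDB returns for the non-zero counts. Query for all users, query for all movies, create the baseline zero-count records, then do the actual query for the non-zero counts and layer those on top. Finally, apply the paging/filtering/sorting logic.

3) Use the SQL replication bundle to replicate the Users, Movies, and MovieRental tables out to SQL, and use SQL for this "reporting" query.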
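
Option 2 above can be sketched in plain LINQ, roughly as follows. This is only an illustration under stated assumptions, not the answerer's code: the RentalCount row type and the RentalReport.Build helper are hypothetical names, and the three input collections stand in for whatever your own RavenDB queries return (all user ids, all movie ids, and the rental counts or raw rentals).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for the merged result (not part of the question's code).
public class RentalCount
{
    public string UserId { get; set; }
    public string MovieId { get; set; }
    public int Count { get; set; }
}

public static class RentalReport
{
    // Builds one row per (user, movie) pair; pairs without rentals get Count = 0.
    // Assumption: the inputs have already been loaded from RavenDB by the caller.
    public static List<RentalCount> Build(
        IEnumerable<string> userIds,
        IEnumerable<string> movieIds,
        IEnumerable<RentalCount> rentalCounts)
    {
        // Sum the counts per (user, movie) so both pre-aggregated rows
        // and raw rentals with Count = 1 are handled.
        var lookup = rentalCounts
            .GroupBy(c => Tuple.Create(c.UserId, c.MovieId))
            .ToDictionary(g => g.Key, g => g.Sum(c => c.Count));

        // Materialize once; the movie list is enumerated for every user.
        var movies = movieIds.ToList();

        var rows = new List<RentalCount>();
        foreach (var userId in userIds)
        {
            foreach (var movieId in movies)
            {
                int count;
                // TryGetValue leaves count at 0 when the pair has no rentals.
                lookup.TryGetValue(Tuple.Create(userId, movieId), out count);
                rows.Add(new RentalCount { UserId = userId, MovieId = movieId, Count = count });
            }
        }
        return rows;
    }
}
```

Because the merged rows live in memory, sorting for one user is an ordinary LINQ-to-objects call, for example rows.Where(r => r.UserId == "John").OrderByDescending(r => r.Count). The cost is that every request loads all users and all movies, so this is likely only reasonable for small data sets or for a single user at a time.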