How to efficiently count a one-to-squillions relationship in Mongoose / MongoDB? (e.g. counting the "likes" of a tweet on Twitter)

Time: 2018-02-01 03:42:00

Tags: mongodb mongoose

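Let's imagine, just for the sake of a simple example, that I'm building a Twitter-like application (using Mongoose / MongoDB).

I have a "tweets" collection. My question is: how do I manage the "like count" of a tweet without putting unnecessary strain on the database?

My first instinct was to create another collection called "likes", where each document stores the ID of the user who liked the tweet and the ID of the tweet they liked.

But then I realized that if I wanted to show 20 tweets on the front end, that would take 21 queries (this is where I think I'm misunderstanding something fundamental; it shouldn't take that many queries): one query to find the 20 most recent tweets, and then one more query per tweet to count how many "likes" documents reference it. Is there a more efficient way to handle this in MongoDB? Or is this where I need to turn to some kind of caching solution in my application?

My next idea was to embed a "usersWhoHaveLiked" array in each tweet document, as follows: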

{
    _id: ObjectId("abc123abc123"),
    title: "My first tweet",
    author: 3,
    usersWhoHaveLiked: [3, 20, 17, 5]
}

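However, if thousands upon thousands of users can "like" a tweet, that array could become very large, and I'm worried that modifying an array of that size could be CPU-expensive or slow, or could outright exceed the 16 MB limit allowed per document.

I realize there are many different ways to architect this, so I'm not looking for the "best" way, which I know is highly subjective. What makes this question at least somewhat objective is that we want to minimize the strain on the database and the server, which is measurable.

I'm a database newbie, so if there's a Mongoose / MongoDB-flavored way of handling this, feel free to point out things that might be very obvious to others :)

Thanks!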

1 answer:

Answer 0 (score: 2)

MongoDB's blog post on this topic describes three kinds of one-to-N relationships:

One-to-Few

Generally fewer than a few hundred items, although other factors also have an impact.

A data object for your example might look like:

{
    _id: ObjectId("abc123abc123"),
    title: "My first tweet",
    author: 3,
    usersWhoHaveLiked: [
        { name: 'Foo' },
        { name: 'Bar' }
    ]
}

Getting the tweet and its like count takes a single query to MongoDB, followed by reading the length of the usersWhoHaveLiked array:

Tweets.findById('abc123abc123').exec().then((tweet) => {
    const likeCount = tweet.usersWhoHaveLiked.length;
    // do something with tweet and likeCount
});

One-to-Many

Generally "up to several hundred [items], but never more than a couple thousand or so".

A data object for your example might look like:

{
    _id: ObjectId("abc123abc123"),
    title: "My first tweet",
    author: 3,
    usersWhoHaveLiked: [3, 20, 17, 5]
}

Getting the tweet and its like count works the same way as in the one-to-few case:

Tweets.findById('abc123abc123').exec().then((tweet) => {
    const likeCount = tweet.usersWhoHaveLiked.length;
    // do something with tweet and likeCount
});
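As the array approaches the upper end of this range, fetching the whole array just to read its length becomes wasteful. The following is a minimal sketch, assuming the `usersWhoHaveLiked` field from the example above. The helper names (`likeUpdate`, `unlikeUpdate`, `likeCountPipeline`) are hypothetical; `$addToSet`, `$pull`, `$size` and `$ifNull` are standard MongoDB operators. The helpers only build the plain objects that would be passed to the database, so the counting happens on the server and the array never has to leave it.

```javascript
// Hypothetical helpers: they only build the plain objects that would be
// handed to Mongoose/MongoDB, so they can be inspected without a database.

// Update document that records a like; $addToSet ignores duplicate user IDs.
function likeUpdate(userId) {
    return { $addToSet: { usersWhoHaveLiked: userId } };
}

// Update document that removes a like.
function unlikeUpdate(userId) {
    return { $pull: { usersWhoHaveLiked: userId } };
}

// Aggregation pipeline that returns the tweet with a computed likeCount
// and without the array itself. $ifNull guards against tweets that have
// no usersWhoHaveLiked field yet, because $size fails on a missing field.
function likeCountPipeline(tweetId) {
    return [
        { $match: { _id: tweetId } },
        { $addFields: { likeCount: { $size: { $ifNull: ['$usersWhoHaveLiked', []] } } } },
        { $project: { usersWhoHaveLiked: 0 } }
    ];
}
```

These would be used as `Tweets.updateOne({ _id: tweetId }, likeUpdate(userId))` and `Tweets.aggregate(likeCountPipeline(tweetId))`. Note that Mongoose does not cast values inside aggregation pipelines, so `tweetId` has to be an actual ObjectId there rather than a string.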

One-to-Squillions

Generally "more than a couple thousand or so".

A data object for your example might look like:

// tweet
{
    _id: ObjectId("abc123abc123"),
    title: "My first tweet",
    author: 3
}

// likes
{
    _id: ObjectId("abc123abc124"),
    tweet: ObjectId("abc123abc123"),
    author: 4 // or could be embedded info as well or a mix
}

Getting the tweet and its like count takes two queries, which can run in parallel:

Promise.all([
    Tweets.findById('abc123abc123').exec(),
    Likes.countDocuments({ tweet: 'abc123abc123' }).exec() // use Likes.count(...) on Mongoose versions before 5.2
]).then(([tweet, likeCount]) => {
    // do something with tweet and likeCount
});
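For a page of 20 tweets, the same idea extends to two queries in total rather than 21: one for the tweets and one aggregation over the likes collection that groups by tweet. This is only a sketch under the schema shown above (a `tweet` field on each like); the helper names `likeCountsPipeline` and `attachLikeCounts` are hypothetical.

```javascript
// Hypothetical helper: pipeline for the likes collection that counts the
// likes of many tweets in a single round trip.
function likeCountsPipeline(tweetIds) {
    return [
        { $match: { tweet: { $in: tweetIds } } },
        { $group: { _id: '$tweet', likeCount: { $sum: 1 } } }
    ];
}

// Attach the counts to plain tweet objects. Tweets without any likes do
// not appear in the $group output, so their count defaults to 0.
function attachLikeCounts(tweets, counts) {
    const byTweet = new Map(counts.map((c) => [String(c._id), c.likeCount]));
    return tweets.map((t) => ({ ...t, likeCount: byTweet.get(String(t._id)) || 0 }));
}
```

In use, the tweets would be loaded first (for example with `Tweets.find().sort({ _id: -1 }).limit(20).lean().exec()`), their `_id` values passed to `Likes.aggregate(likeCountsPipeline(ids))`, and both results combined with `attachLikeCounts`. An index on the `tweet` field of the likes collection keeps the `$match` stage from scanning every like.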

There are some ways to simplify this and I will leave them up to you to explore:

  • In the first two examples, create a virtual getter that will get the array length for you (i.e. tweet.likeCount)
  • For the last example, create a post save hook from likes that will update a property on tweets (e.g. likeCount).
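The second suggestion, keeping a denormalized `likeCount` on the tweet, might be sketched as below. The helper name `bumpLikeCount` and the `likeCount` field are assumptions for illustration; `Tweets` stands for any object that exposes Mongoose's `Model.updateOne(filter, update)`.

```javascript
// Hypothetical helper. `Tweets` is any object exposing Mongoose's
// Model.updateOne(filter, update); `delta` is +1 for a like, -1 for an unlike.
function bumpLikeCount(Tweets, tweetId, delta) {
    // $inc is applied atomically on the server, so concurrent likes
    // do not overwrite each other's counts.
    return Tweets.updateOne({ _id: tweetId }, { $inc: { likeCount: delta } });
}
```

This could be called right after a like document is saved, or from a `post('save')` hook on the likes schema. Mongoose queries are lazy, so the returned query has to be awaited or run with `.exec()` before the update is actually sent. With the counter stored on the tweet, showing 20 tweets with their like counts becomes a single query again.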

A final note: which of the three strategies to use depends on more than just the number of items. Two other key concerns are whether the data needs to stand on its own and how quickly the array changes.