Question

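I need a table with the following attributes: userId, commentId, commentTopic, commentCountry. I would like to avoid using a Global Secondary Index. What would be the best design that allows me to perform the following operations: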

1) Get all comments belonging to a userId.
2) Get a comment with commentID = "commentID"
3) Get comments of all users  where topic = "commentTopic" and country = "commentCountry"

I was thinking of:

1) userId#commentId as partitionKey which will uniformly distribute the load and allow me to perform operation 1 and 2 from above list.
2) commentCountry#commentTopic as RangeKey which will allow me to perform operation 3 efficiently.

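Are there any drawbacks to this approach? Would this table design be efficient? What are the disadvantages of concatenating multiple keys together?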

Answer 1

Here is an idea you could use. You can define a table Comments with the following attributes:

UserId (string) - partition key
CommentId (string) - sort key

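Define the following GSI:

CommentId (string) - GSI partition key

And a second GSI:

CommentTopicAndCommentCountry (string, GSI partition key) - a composite field that combines the two values, something like "123_Germany"
CommentId (string, GSI sort key) - exactly the same field as before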
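
As a rough illustration, the table and the two GSIs described above could be written as a parameter dictionary in the shape of DynamoDB's CreateTable request. This is only a sketch: the index names `CommentIdIndex` and `TopicCountryIndex`, the `ALL` projection, and the on-demand billing mode are assumptions made for the example, not part of the design above.

```python
def comments_table_definition():
    """Build CreateTable-style parameters for the Comments table.

    Index names, projections and billing mode are hypothetical choices
    made for this sketch.
    """
    return {
        "TableName": "Comments",
        # Only attributes used in a key schema need to be declared.
        "AttributeDefinitions": [
            {"AttributeName": "UserId", "AttributeType": "S"},
            {"AttributeName": "CommentId", "AttributeType": "S"},
            {"AttributeName": "CommentTopicAndCommentCountry", "AttributeType": "S"},
        ],
        # Base table: UserId is the partition key, CommentId the sort key.
        "KeySchema": [
            {"AttributeName": "UserId", "KeyType": "HASH"},
            {"AttributeName": "CommentId", "KeyType": "RANGE"},
        ],
        "GlobalSecondaryIndexes": [
            {
                # First GSI: look up a comment by its ID alone.
                "IndexName": "CommentIdIndex",
                "KeySchema": [{"AttributeName": "CommentId", "KeyType": "HASH"}],
                "Projection": {"ProjectionType": "ALL"},
            },
            {
                # Second GSI: combined topic/country value, sorted by CommentId.
                "IndexName": "TopicCountryIndex",
                "KeySchema": [
                    {"AttributeName": "CommentTopicAndCommentCountry", "KeyType": "HASH"},
                    {"AttributeName": "CommentId", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            },
        ],
        # Assumption: on-demand billing, so no throughput values are needed here.
        "BillingMode": "PAY_PER_REQUEST",
    }
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `client.create_table(**comments_table_definition())`.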

Now for your queries:

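Get all comments belonging to a userId

This is pretty straightforward: query the table by its partition key/sort key and specify only the partition key.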

Get a comment with commentID = "commentID"

Use the first GSI and provide the comment ID.

Get comments of all users where topic = "commentTopic" and country = "commentCountry"

Use the second GSI and provide the combined topic and country value as the single partition key value.
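
The three queries above might look roughly like this as parameter dictionaries in the shape of DynamoDB's low-level Query request. It is a sketch under assumptions: the index names `CommentIdIndex` and `TopicCountryIndex` are hypothetical, and the `_` separator simply follows the "123_Germany" example.

```python
TABLE_NAME = "Comments"  # table name taken from the answer


def combined_key(topic, country):
    """Join topic and country into the composite GSI partition key value."""
    return f"{topic}_{country}"


def comments_by_user(user_id):
    """Query the base table using only the partition key."""
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "UserId = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }


def comment_by_id(comment_id):
    """Query the first GSI (hypothetical name) by comment ID."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": "CommentIdIndex",
        "KeyConditionExpression": "CommentId = :cid",
        "ExpressionAttributeValues": {":cid": {"S": comment_id}},
    }


def comments_by_topic_and_country(topic, country):
    """Query the second GSI (hypothetical name) by the combined value."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": "TopicCountryIndex",
        "KeyConditionExpression": "CommentTopicAndCommentCountry = :tc",
        "ExpressionAttributeValues": {":tc": {"S": combined_key(topic, country)}},
    }
```

Each dictionary could be passed to a boto3 client as `client.query(**params)`. Whatever code writes an item has to build the combined attribute with the same helper, otherwise the index lookup will not match.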

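The limitation is that you can store at most 10 GB of data under a single partition key, so if a single user might write more than 10 GB of comments, or a single topic might accumulate more than 10 GB of comments, you may want to use multiple tables instead of one.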

You could have Comments_2017, Comments_2016, and so on, to store comments from different years. You can go more granular if you wish.

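Another benefit of this is that you can set different RCU/WCU values for different tables. I would guess that newer comments are read more often, so you can set high RCU/WCU for the newest table, while old comments are not written at all and are probably read infrequently. In that case you can set lower RCUs there.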
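
A minimal sketch of the per-year split, assuming each comment carries a creation timestamp (an attribute not listed in the question). The capacity numbers are illustrative placeholders only, not recommendations.

```python
from datetime import datetime, timezone


def table_name_for(created_at: datetime) -> str:
    """Pick the yearly table a comment belongs to, e.g. Comments_2017."""
    return f"Comments_{created_at.year}"


def throughput_for(table_year: int, current_year: int) -> dict:
    """Return ProvisionedThroughput-style settings for one yearly table.

    The numbers are placeholders: the newest table gets more capacity,
    older tables get a small read budget and the minimum write budget.
    """
    if table_year == current_year:
        return {"ReadCapacityUnits": 100, "WriteCapacityUnits": 50}
    return {"ReadCapacityUnits": 5, "WriteCapacityUnits": 1}


# Example: route a comment written in March 2017 to its table.
example_table = table_name_for(datetime(2017, 3, 1, tzinfo=timezone.utc))
```

One consequence of this layout is that a query spanning several years has to be issued against each yearly table and the results merged by the caller.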

Answer 2

PK = USER#userid 
SK = COMMENT#commentId(GSIPK)
GSISK = commentTopic#country
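
To make the key layout concrete, here is a small sketch of how an item's key attributes could be assembled. The function name is hypothetical. It follows the layout above, where the SK attribute doubles as the GSI partition key.

```python
def build_comment_keys(user_id, comment_id, topic, country):
    """Assemble the key attributes for one comment item.

    SK also serves as the GSI partition key, so no separate
    GSIPK attribute is stored in this sketch.
    """
    return {
        "PK": f"USER#{user_id}",
        "SK": f"COMMENT#{comment_id}",
        "GSISK": f"{topic}#{country}",
    }


# Example item keys for user 42, comment 1001, topic "sports", country "Germany".
example_keys = build_comment_keys("42", "1001", "sports", "Germany")
```

Because `#` is the separator, the values themselves should not contain `#`, or the prefixes become ambiguous.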

Get all comments of a user

PK = USER#userid AND begins_with(SK, COMMENT#)

Getting a comment by comment ID can be done against the GSI

GSIPK = COMMENT#commentId

Get comments of all users where topic = "commentTopic" and country = "commentCountry"

GSIPK START_WITH COMMENT# AND GSISK = commentTopic#country
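
The first two access patterns above can be sketched as Query-style parameter dictionaries. The table name `Comments` and index name `GSI1` are assumptions for illustration. The third pattern is left out of the sketch, because a DynamoDB Query key condition needs an equality test on the partition key and accepts `begins_with` only on the sort key.

```python
TABLE_NAME = "Comments"  # hypothetical table name
GSI_NAME = "GSI1"        # hypothetical index name


def user_comments_query(user_id):
    """All comments of one user: exact PK, SK restricted by prefix."""
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"USER#{user_id}"},
            ":prefix": {"S": "COMMENT#"},
        },
    }


def comment_by_id_query(comment_id):
    """One comment by ID: the SK attribute is the GSI partition key."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": GSI_NAME,
        "KeyConditionExpression": "SK = :sk",
        "ExpressionAttributeValues": {":sk": {"S": f"COMMENT#{comment_id}"}},
    }
```

Either dictionary could be passed to a boto3 client as `client.query(**params)`.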

DynamoDB table design