Question

I have the following data in my DynamoDB table.

Here is my code:

const userStatusParams = {
        TableName: process.env.USERSTATUS_TABLE,
        KeyConditionExpression: "loggedIn = :loggedIn",
        ExpressionAttributeValues: {
          ":loggedIn": true
        }
      };
      var usersResult;
      try {
        usersResult = await dynamoDbLib.call("query", userStatusParams);
        console.log(usersResult);
      }catch (e) {
        console.log("Error occurred querying for users belong to group.");
        console.log(e);
      }

Amazon returns this error:

{ ValidationException: Query condition missed key schema element: userId
    at Request.extractError ...

How can I get it to return all records where loggedIn == true?

My database is currently built from my serverless.yml config.

phoneNumberTable: #This table is used to track phone numbers used in the system.
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${self:custom.phoneNumberTable}
        AttributeDefinitions: #UserID in this case will be created once and constantly updated as it changes with status regarding the user.
          - AttributeName: phoneNumber
            AttributeType: S
        KeySchema:
          - AttributeName: phoneNumber
            KeyType: HASH
        ProvisionedThroughput:
            ReadCapacityUnits: ${self:custom.dynamoDbCapacityUnits.${self:custom.pstage}}
            WriteCapacityUnits: ${self:custom.dynamoDbCapacityUnits.${self:custom.pstage}}

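I did some research on this through other answers, but could not figure out my situation. In the other answers they have sort keys, but I am not using a sort key here.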

Answer 1

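If you are performing a query, you must pass the primary key, which in your case is userId. If you do not have the primary key and you want all items where loggedIn = true, you can do a scan with a FilterExpression, like this: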

const userStatusParams = {
        TableName: process.env.USERSTATUS_TABLE,
        FilterExpression: 'loggedIn = :loggedIn',
        ExpressionAttributeValues: {
          ":loggedIn": true
        }
      };
      var usersResult;
      try {
        // Do scan
        usersResult = await dynamoDbLib.call("scan", userStatusParams);
        console.log(usersResult);
      }catch (e) {
        console.log("Error occurred querying for users belong to group.");
        console.log(e);
      }
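
One caveat with the scan above: a single Scan call reads at most 1 MB of data before the filter is applied, so on a larger table the response may hold only the first page. Here is a minimal sketch of following LastEvaluatedKey until the table is exhausted. It assumes dynamoDbLib.call(action, params) resolves to the raw DynamoDB response; scanAll and callDynamo are hypothetical names, not part of any library.

```javascript
// Hypothetical helper: collects every page of a filtered scan.
// `callDynamo` is assumed to behave like dynamoDbLib.call(action, params)
// from the question, i.e. it returns a promise of the raw response.
async function scanAll(callDynamo, params) {
  const items = [];
  let startKey; // undefined on the first request
  do {
    // Only send ExclusiveStartKey once a previous page has returned one.
    const request = startKey
      ? { ...params, ExclusiveStartKey: startKey }
      : params;
    const page = await callDynamo("scan", request);
    items.push(...(page.Items || []));
    // LastEvaluatedKey is present only while more data remains.
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return items;
}
```

It could be called as scanAll((action, p) => dynamoDbLib.call(action, p), userStatusParams). Every page still consumes read capacity for all items scanned, not only the ones that pass the filter.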

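Update: since the scan operation is less efficient, another way to solve this is to create a GSI with loggedIn as its primary key. The catch is that a field with a boolean data type cannot be used as a key. A key attribute must be a number, string, or binary. So to create the GSI, you need to store one of the accepted data types in the loggedIn field instead of a boolean.

I am not sure how much of a performance impact this will have on a table of a thousand records, but the nice thing about GSIs is that you can create them later, even on an existing table, if you run into performance problems in the future. Also, the number of GSIs you can create on a table is limited (the default quota was 5 when this was written and is now 20), so use GSIs wisely.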
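
As a rough sketch of adding such an index to an existing table, these are the parameters an UpdateTable call would take. The index name "loggedIn-index", the capacity of 1, and storing loggedIn as a string are assumptions for illustration; buildAddGsiParams is a hypothetical helper.

```javascript
// Sketch of UpdateTable parameters that add a GSI to an existing table.
// Assumptions: index name "loggedIn-index", capacity units of 1, and
// loggedIn stored as a string ("true"/"false") rather than a boolean.
function buildAddGsiParams(tableName) {
  return {
    TableName: tableName,
    // Declare only the attribute used as a key by the new index.
    AttributeDefinitions: [
      { AttributeName: "loggedIn", AttributeType: "S" }
    ],
    GlobalSecondaryIndexUpdates: [
      {
        Create: {
          IndexName: "loggedIn-index",
          KeySchema: [{ AttributeName: "loggedIn", KeyType: "HASH" }],
          Projection: { ProjectionType: "ALL" },
          ProvisionedThroughput: {
            ReadCapacityUnits: 1,
            WriteCapacityUnits: 1
          }
        }
      }
    ]
  };
}
```

UpdateTable is an operation of the low-level DynamoDB client, not the DocumentClient. Existing items only appear in the index once their loggedIn attribute holds a string value.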

Answer 2

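A Scan operation always scans the entire table or secondary index, then filters out values to provide the desired result, essentially adding the extra step of removing data from the result set. If possible, avoid using a Scan operation on a large table or index with a filter that removes many results.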

You should use a global secondary index!

AWS Console > DynamoDB > Indexes tab of the table > Create index >

primary (partition) key - loggedIn
secondary (sort) key - userId
projected attributes - all

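We add the secondary key to get a unique (loggedIn, userId) pair; loggedIn by itself is not unique, so don't key the index on it alone.

You can then use the Query method with the primary key (loggedIn).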
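
A minimal sketch of what that Query could look like. It assumes the index is named "loggedIn-userId-index" and that loggedIn is stored as the string "true", because a boolean attribute cannot be an index key; buildLoggedInQuery is a hypothetical helper.

```javascript
// Hypothetical builder for a Query against the index described above.
// Assumptions: the index is named "loggedIn-userId-index" and loggedIn
// is stored as the string "true" instead of a boolean.
function buildLoggedInQuery(tableName) {
  return {
    TableName: tableName,
    // Without IndexName the query targets the base table and fails
    // with the same "missed key schema element" error.
    IndexName: "loggedIn-userId-index",
    KeyConditionExpression: "loggedIn = :loggedIn",
    ExpressionAttributeValues: { ":loggedIn": "true" }
  };
}
```

With the wrapper from the question this would be passed as dynamoDbLib.call("query", buildLoggedInQuery(process.env.USERSTATUS_TABLE)).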

Answer 3

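To query a DynamoDB table, you can only query on indexed fields. An indexed field can be:

the primary key
the hash key of a global secondary index

To query for loggedIn records, you need to add a global secondary index on the loggedIn field.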

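I would not use it in your case with your data, because every loggedIn record has one of only 2 values (true / false), which will cause many throughput errors unless you have high capacity. (Because of the poor hash distribution across partitions, you will always have hot keys.)

If you still want to use this query on DynamoDB, you should change the 'loggedIn' values to prevent 'hot' keys.

The solution is to add a suffix to your data: (true.0, true.1 ... true.N)

N should be [the expected number of partitions for this index + some room for growth] (if you expect high load, or many 'true/false' records, you can choose N = 200) (for good hash distribution across partitions). I suggest deriving the suffix as userId modulo N. (This helps you perform some operations by userId.)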
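
The suffix scheme can be sketched roughly as follows. It assumes userId is a string, so it is hashed before taking the modulo (with a numeric userId, userId % N would do). SHARD_COUNT and the three function names are illustrative, not part of any library.

```javascript
// Sketch of the suffix scheme: spread "true"/"false" over N index keys.
// SHARD_COUNT and all function names here are illustrative assumptions.
const SHARD_COUNT = 10;

// Map a userId string to a stable shard number in [0, shardCount).
function shardOf(userId, shardCount = SHARD_COUNT) {
  let hash = 0;
  for (const ch of userId) {
    hash = (hash * 31 + ch.codePointAt(0)) % shardCount;
  }
  return hash;
}

// Value to store in the indexed attribute, for example "true.4".
function shardedLoggedIn(loggedIn, userId, shardCount = SHARD_COUNT) {
  return `${loggedIn}.${shardOf(userId, shardCount)}`;
}

// Every key that has to be queried to find all users in one state.
function allShardKeys(loggedIn, shardCount = SHARD_COUNT) {
  return Array.from({ length: shardCount }, (_, i) => `${loggedIn}.${i}`);
}
```

The trade-off is that listing all logged-in users now takes one Query per key returned by allShardKeys(true), with the results merged on the client.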

How do I query DynamoDB using a non-primary-key field?