Question

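I need to check all the items in a specific table in DynamoDB.

My table contains 10 million items. I tried to fetch them all, but I can't put them into a list because it is too large. My goal is to go through every item and see whether I can delete it.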

Answer 1

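Below is sample code for scanning a table. I'm not sure whether you already have code like this.

The Scan API can't give you all the records in one go. You have to run the scan repeatedly, passing the previous LastEvaluatedKey as the next start key, until LastEvaluatedKey comes back null; only then have you seen every item in the table. Think of it as paginated output. This way you don't need to handle all the items (i.e. 10 million) in one scan. In addition, you don't spend the cost (i.e. read capacity units) all at once.

If the total size of the scanned items exceeds the maximum data set size limit of 1 MB, the scan stops and the results are returned to the user together with a LastEvaluatedKey value, which is used to continue the scan in a subsequent operation. The results also include the number of items exceeding the limit. A scan can result in no table data meeting the filter criteria.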

Scan API

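import java.util.List;
import java.util.Map;

import com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;

public class ScanTable {

    public static void main(String[] args) {

        // Points at a local DynamoDB endpoint; change the endpoint and region for a real table.
        AmazonDynamoDB amazonDynamoDB = AmazonDynamoDBClientBuilder.standard()
                .withEndpointConfiguration(new EndpointConfiguration("http://localhost:8000", "us-east-1")).build();

        ScanRequest scanRequest = new ScanRequest().withTableName("Movies");

        Map<String, AttributeValue> lastKey = null;

        do {
            // Each call returns one page (up to 1 MB of scanned data).
            ScanResult scanResult = amazonDynamoDB.scan(scanRequest);

            List<Map<String, AttributeValue>> results = scanResult.getItems();

            // You can get the results here; process this page before fetching the next one.
            results.forEach(System.out::println);

            // A null key means the last page has been read; otherwise resume from it.
            lastKey = scanResult.getLastEvaluatedKey();
            scanRequest.setExclusiveStartKey(lastKey);
        } while (lastKey != null);

    }
}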
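
One detail of the loop above is easy to get wrong: when a filter is applied, a page can come back with no items at all and still carry a LastEvaluatedKey, so the loop has to stop on the key and not on an empty page. The following is only a standard-library model of that contract, not SDK code; `ScanPagingModel`, `Page`, the `scan` stand-in, and the page size of 3 items (instead of 1 MB) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class ScanPagingModel {

    /** One page of a scan: the matching items plus the key to resume from (null when done). */
    static final class Page {
        final List<Integer> items;
        final Integer lastEvaluatedKey;

        Page(List<Integer> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /**
     * Hypothetical stand-in for a Scan call: reads at most pageSize items after
     * exclusiveStartKey, then applies the filter to what was read.
     */
    static Page scan(List<Integer> table, Integer exclusiveStartKey, int pageSize, Predicate<Integer> filter) {
        int from = (exclusiveStartKey == null) ? 0 : exclusiveStartKey + 1;
        int to = Math.min(from + pageSize, table.size());
        List<Integer> matched = new ArrayList<>();
        for (int i = from; i < to; i++) {
            if (filter.test(table.get(i))) {
                matched.add(table.get(i));
            }
        }
        // The key is the index of the last item read, or null once the table is exhausted.
        Integer lastKey = (to < table.size()) ? Integer.valueOf(to - 1) : null;
        return new Page(matched, lastKey);
    }

    /** Follows the key until it is null; returns {pagesRead, emptyPages, itemsMatched}. */
    public static int[] scanAll(List<Integer> table, int pageSize, Predicate<Integer> filter) {
        int pages = 0;
        int emptyPages = 0;
        int matched = 0;
        Integer key = null;
        do {
            Page page = scan(table, key, pageSize, filter);
            pages++;
            if (page.items.isEmpty()) {
                emptyPages++; // an empty page does not mean the scan is finished
            }
            matched += page.items.size();
            key = page.lastEvaluatedKey;
        } while (key != null);
        return new int[] {pages, emptyPages, matched};
    }

    public static void main(String[] args) {
        List<Integer> table = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        int[] stats = scanAll(table, 3, v -> v > 6);
        System.out.println("pages=" + stats[0] + " emptyPages=" + stats[1] + " matched=" + stats[2]);
    }
}
```

In this model the first two pages match nothing, yet the scan still needs four pages to find the four matching items. A loop that stopped at the first empty page would have found none of them.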

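Not clear:

I understand that you want to retrieve all the items and do some processing. However, I'm not sure why you want to insert them into a list.

If you process each scan result (i.e. 1 MB of data) separately, you may not need to insert the items into a list and hold them in heap memory. Of course, whichever approach you take still needs some memory.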
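
Processing page by page could look roughly like the sketch below: each page is inspected, only the keys chosen for deletion are kept, and they are flushed in small batches so memory stays bounded. It is a standard-library model and not SDK code; `PageByPageCleanup`, `cleanup`, `shouldDelete`, and `deleteBatch` are hypothetical names. The batch size of 25 reflects the BatchWriteItem limit of 25 write requests per call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class PageByPageCleanup {

    // BatchWriteItem accepts at most 25 put/delete requests per call.
    static final int MAX_BATCH = 25;

    /**
     * Walks pages one at a time, keeps only the keys chosen for deletion, and hands them
     * to deleteBatch in groups of at most MAX_BATCH. Returns the number of keys flushed.
     */
    public static <T> int cleanup(Iterable<List<T>> pages, Predicate<T> shouldDelete,
                                  Consumer<List<T>> deleteBatch) {
        List<T> pending = new ArrayList<>(MAX_BATCH);
        int deleted = 0;
        for (List<T> page : pages) {
            for (T item : page) {
                if (!shouldDelete.test(item)) {
                    continue;
                }
                pending.add(item);
                if (pending.size() == MAX_BATCH) {
                    deleteBatch.accept(new ArrayList<>(pending));
                    deleted += pending.size();
                    pending.clear();
                }
            }
            // Nothing from this page is retained here except the pending keys.
        }
        if (!pending.isEmpty()) {
            // Flush the final, partially filled batch.
            deleteBatch.accept(new ArrayList<>(pending));
            deleted += pending.size();
        }
        return deleted;
    }

    public static void main(String[] args) {
        // Three fake pages of integer "keys"; even keys are treated as deletable.
        List<List<Integer>> pages = new ArrayList<>();
        for (int p = 0; p < 3; p++) {
            List<Integer> page = new ArrayList<>();
            for (int i = 0; i < 40; i++) {
                page.add(p * 40 + i);
            }
            pages.add(page);
        }
        List<Integer> batchSizes = new ArrayList<>();
        int deleted = cleanup(pages, k -> k % 2 == 0, batch -> batchSizes.add(batch.size()));
        System.out.println("deleted=" + deleted + " batchSizes=" + batchSizes);
    }
}
```

With the real SDK, the pages would come from successive `amazonDynamoDB.scan(scanRequest)` calls, and `deleteBatch` would presumably wrap a BatchWriteItem call. That call can return unprocessed items, which would need to be retried.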
