Scanning a DynamoDB table with rate-limited reads or a throughput limit

Time: 2013-01-13 03:54:16

Tags: nosql amazon-dynamodb

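Does anyone have sample Java code that performs a scan operation on a DynamoDB table, where the scan uses only a certain percentage of the table's throughput limit? Thanks in advance.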

1 answer:

Answer 0: (score: 4)

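Yesterday we published a blog post on the AWS Java Developer Blog about how to do Rate Limited Scans in Amazon DynamoDB. I'm not sure which programming language you are using, but if it is Java, this approach, which uses the Google Guava RateLimiter class, may work for you. Greg's earlier answer is also correct: if you are using Amazon Elastic MapReduce, the DynamoDB plugin supports a configurable read and write throughput percent so that it throttles itself when scanning your table. The Amazon Redshift integration for DynamoDB has this setting as well.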

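Here is a snippet from the blog post that shows how to perform a paginated scan, using RateLimiter and the AWS SDK for Java to limit itself to consuming 25 read capacity units per second: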

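// Imports, assuming the AWS SDK for Java 1.x (dynamodbv2) and Google Guava
import java.util.Map;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ReturnConsumedCapacity;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;
import com.google.common.util.concurrent.RateLimiter;

// `dynamodb` below is assumed to be an already-initialized AmazonDynamoDB client

// Initialize the rate limiter to allow 25 read capacity units / sec
RateLimiter rateLimiter = RateLimiter.create(25.0);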

// Track how much throughput we consume on each page
int permitsToConsume = 1;

// Initialize the pagination token
Map<String, AttributeValue> exclusiveStartKey = null;

do {
    // Let the rate limiter wait until our desired throughput "recharges"
    rateLimiter.acquire(permitsToConsume);

    // Do the scan
    ScanRequest scan = new ScanRequest()
        .withTableName("ProductCatalog")
        .withLimit(100)
        .withReturnConsumedCapacity(ReturnConsumedCapacity.TOTAL)
        .withExclusiveStartKey(exclusiveStartKey);
    ScanResult result = dynamodb.scan(scan);
    exclusiveStartKey = result.getLastEvaluatedKey();

    // Account for the rest of the throughput we consumed,
    // now that we know how much that scan request cost
    double consumedCapacity = result.getConsumedCapacity().getCapacityUnits();
    permitsToConsume = (int)(consumedCapacity - 1.0);
    if (permitsToConsume <= 0) {
        permitsToConsume = 1;
    }

    // Process results here
    processYourResults(result);

} while (exclusiveStartKey != null);
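
The snippet above hard-codes 25 read capacity units per second, whereas the question asks for a percentage of the table's throughput. A minimal sketch of that arithmetic follows, using only the JDK. The class name `ScanBudget` and both method names are hypothetical and do not come from the blog post. It assumes you already know the table's provisioned read capacity, which with the AWS SDK for Java you could probably read from a DescribeTable call, and it repeats the permit accounting from the loop above in one place.

```java
// Hypothetical helper, not part of the AWS SDK or Guava.
final class ScanBudget {

    // Converts "use this fraction of the table's provisioned read capacity"
    // into a permits-per-second value suitable for RateLimiter.create(...).
    static double permitsPerSecond(long provisionedReadCapacityUnits, double fraction) {
        if (provisionedReadCapacityUnits <= 0) {
            throw new IllegalArgumentException("provisioned capacity must be positive");
        }
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        return provisionedReadCapacityUnits * fraction;
    }

    // Same accounting as the scan loop: charge what the last page consumed,
    // minus one unit, and never ask the limiter for fewer than 1 permit.
    static int permitsForNextPage(double consumedCapacityUnits) {
        return Math.max(1, (int) (consumedCapacityUnits - 1.0));
    }

    public static void main(String[] args) {
        // Assumed example: a table provisioned with 100 read capacity units,
        // of which the scan may use 25%.
        System.out.println(permitsPerSecond(100, 0.25)); // prints 25.0
        System.out.println(permitsForNextPage(12.5));    // prints 11
        System.out.println(permitsForNextPage(0.5));     // prints 1
    }
}
```

With this in place, the limiter in the snippet could be created as `RateLimiter.create(ScanBudget.permitsPerSecond(readCapacityUnits, 0.25))`, where `readCapacityUnits` is whatever value you obtained for the table.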