Question

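I can't seem to get this to work. I want to scan a table and return only the records where a particular field does not exist.

I have tried the following two approaches: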

HashMap<String, Condition> scanFilter = new HashMap<>();
Condition scanFilterCondition = new Condition().withComparisonOperator(ComparisonOperator.NULL.toString());
scanFilter.put("field", scanFilterCondition);

ScanRequest scan = new ScanRequest()
    .withTableName("table name")
    .withScanFilter(scanFilter)
etc

and

ScanRequest scan = new ScanRequest()
          .withTableName("table")
          .withFilterExpression("attribute_not_exists(attributeName)")
          .withLimit(100)
          etc

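However, neither of them returns any records (and most records are missing this field). Note that if I remove the filter, the scan returns and processes all records as expected, so the basic query is correct. How do I do this?

EDIT: added the full method in case it helps.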

// Get information on the table so that we can set the read capacity for the operation.
List<String> tables = client.listTables().getTableNames();
String tableName = tables.stream().filter(table -> table.equals(configuration.getTableName())).findFirst().get();
if(Strings.isNullOrEmpty(tableName))
  return 0;
TableDescription table = client.describeTable(tableName).getTable();

//Set the rate limit to a third of the provisioned read capacity.
int rateLimit = (int) (table.getProvisionedThroughput().getReadCapacityUnits() / 3);
RateLimiter rateLimiter = RateLimiter.create(rateLimit);
// Track how much throughput we consume on each page
int permitsToConsume = 1;
// Initialize the pagination token
Map<String, AttributeValue> exclusiveStartKey = null;
int count = 1;
int writtenCount = 0;

do {
  // Let the rate limiter wait until our desired throughput "recharges"
  rateLimiter.acquire(permitsToConsume);

  //We only want to process records that don't have the field key set.
  HashMap<String, Condition> scanFilter = new HashMap<>();
  Condition scanFilterCondition = new Condition().withComparisonOperator(ComparisonOperator.NULL.toString());
  scanFilter.put("field", scanFilterCondition);

  ScanRequest scan = new ScanRequest()
      .withTableName(configuration.getNotificationsTableName())
      .withScanFilter(scanFilter)
      .withLimit(100)
      .withReturnConsumedCapacity(ReturnConsumedCapacity.TOTAL)
      .withExclusiveStartKey(exclusiveStartKey);

  ScanResult result = client.scan(scan);
  exclusiveStartKey = result.getLastEvaluatedKey();

  // Account for the rest of the throughput we consumed,
  // now that we know how much that scan request cost
  double consumedCapacity = result.getConsumedCapacity().getCapacityUnits();
  permitsToConsume = (int)(consumedCapacity - 1.0);
  if (permitsToConsume <= 0) {
    permitsToConsume = 1;
  }

  // Process results here
} while (exclusiveStartKey != null);

Answer 1

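The NULL condition itself looks fine. What you need is to call Scan repeatedly in a loop. A DynamoDB scan does not go through the whole table in one request; how much data each request reads depends on the provisioned throughput it consumes, and the filter is only applied to the items that request actually read.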
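To see why a single request can come back empty even though matching items exist, here is a small in-memory model. It is not the AWS SDK: `FilterAfterLimitModel`, `Page`, `scanPage` and `item` are hypothetical names, and an integer index stands in for the real key. The sketch assumes, as DynamoDB does, that the limit counts the items read and the filter runs afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory model of one filtered, limited scan page. This is not the AWS SDK. */
public class FilterAfterLimitModel {

    /** One page: the items that passed the filter plus the position to resume from. */
    static final class Page {
        final List<Map<String, String>> items;
        final Integer lastEvaluatedKey; // null means there is nothing left to read

        Page(List<Map<String, String>> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** Reads at most `limit` items from `startKey`, then drops the ones that have `attribute`. */
    static Page scanPage(List<Map<String, String>> table, Integer startKey, int limit, String attribute) {
        int from = (startKey == null) ? 0 : startKey;
        int to = Math.min(from + limit, table.size());
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> item : table.subList(from, to)) {
            if (!item.containsKey(attribute)) {
                matches.add(item);
            }
        }
        Integer next = (to < table.size()) ? Integer.valueOf(to) : null;
        return new Page(matches, next);
    }

    /** Builds a sample item, optionally carrying the attribute "field". */
    static Map<String, String> item(String id, boolean withField) {
        Map<String, String> m = new HashMap<>();
        m.put("id", id);
        if (withField) {
            m.put("field", "x");
        }
        return m;
    }

    public static void main(String[] args) {
        // The first four items have "field"; only the last two lack it.
        List<Map<String, String>> table = Arrays.asList(
                item("1", true), item("2", true), item("3", true),
                item("4", true), item("5", false), item("6", false));

        Page first = scanPage(table, null, 4, "field");
        System.out.println(first.items.size() + " match(es), next key " + first.lastEvaluatedKey);

        Page second = scanPage(table, first.lastEvaluatedKey, 4, "field");
        System.out.println(second.items.size() + " match(es), next key " + second.lastEvaluatedKey);
    }
}
```

In this model the first page has no matches but still carries a key, and the matches only show up on the second page. Code that looks at one page, or stops at the first empty page, would report no records.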

Sample code that scans in a loop driven by LastEvaluatedKey:

ScanResult result = null;

do {
    // Match only items where the attribute "title" does not exist.
    HashMap<String, Condition> scanFilter = new HashMap<>();
    Condition scanFilterCondition = new Condition().withComparisonOperator(ComparisonOperator.NULL);
    scanFilter.put("title", scanFilterCondition);

    ScanRequest scanRequest = new ScanRequest().withTableName(tableName).withScanFilter(scanFilter);
    // Resume from where the previous page stopped.
    if (result != null) {
        scanRequest.setExclusiveStartKey(result.getLastEvaluatedKey());
    }

    result = dynamoDBClient.scan(scanRequest);

    LOGGER.info("Number of records ==============>" + result.getItems().size());

    for (Map<String, AttributeValue> item : result.getItems()) {
        LOGGER.info("Movies ==================>" + item.get("title"));
    }
} while (result.getLastEvaluatedKey() != null);

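NULL: The attribute does not exist. NULL is supported for all data types, including lists and maps. Note that this operator tests for the nonexistence of an attribute, not its data type. If the data type of attribute "a" is null and you evaluate it using NULL, the result is a Boolean false. This is because the attribute "a" exists; its data type is not relevant to the NULL comparison operator.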
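The distinction between "absent" and "present with the null data type" can be illustrated with a plain map. This is only a model, not the AWS SDK: `AttributeAbsenceModel`, `NULL_TYPED` and `attributeNotExists` are hypothetical names invented for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Models the NULL comparison operator with a plain map. This is not the AWS SDK. */
public class AttributeAbsenceModel {

    /** Stand-in for an attribute that is stored with the null data type. */
    static final Object NULL_TYPED = new Object();

    /** True only when the attribute is absent; the stored value is never inspected. */
    static boolean attributeNotExists(Map<String, Object> item, String name) {
        return !item.containsKey(name);
    }

    public static void main(String[] args) {
        Map<String, Object> missing = new HashMap<>();
        missing.put("id", "1"); // no attribute "a" at all

        Map<String, Object> nullTyped = new HashMap<>();
        nullTyped.put("id", "2");
        nullTyped.put("a", NULL_TYPED); // "a" exists, its type is null

        System.out.println(attributeNotExists(missing, "a"));   // true
        System.out.println(attributeNotExists(nullTyped, "a")); // false
    }
}
```

So if some items were written with the field explicitly set to a null value, a NULL / attribute_not_exists filter would not return them.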

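LastEvaluatedKey: The primary key of the item where the operation stopped, inclusive of the previous result set. Use this value to start a new operation, excluding this value in the new request.

If LastEvaluatedKey is empty, then the "last page" of results has been processed and there is no more data to be retrieved.

If LastEvaluatedKey is not empty, it does not necessarily mean that there is more data in the result set. The only way to know that you have reached the end of the result set is when LastEvaluatedKey is empty.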
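The last point is easy to miss, so here is a minimal in-memory sketch of it. It is a model rather than the AWS SDK: `PaginationLoopModel`, `Page`, `scanPage` and `pageSizes` are hypothetical names, and the model assumes a key is reported whenever a page stops because it hit the limit, even if nothing is left after it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Models paging by LastEvaluatedKey with an integer index. This is not the AWS SDK. */
public class PaginationLoopModel {

    /** One page of items plus the position to resume from (null when the scan is done). */
    static final class Page {
        final List<String> items;
        final Integer lastEvaluatedKey;

        Page(List<String> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** Reads at most `limit` items; reports a key whenever it stopped because of the limit. */
    static Page scanPage(List<String> table, Integer startKey, int limit) {
        int from = (startKey == null) ? 0 : startKey;
        int to = Math.min(from + limit, table.size());
        List<String> items = new ArrayList<>(table.subList(from, to));
        Integer next = (to - from == limit) ? Integer.valueOf(to) : null;
        return new Page(items, next);
    }

    /** Pages until the key is null and records how many items each page returned. */
    static List<Integer> pageSizes(List<String> table, int limit) {
        List<Integer> sizes = new ArrayList<>();
        Integer key = null;
        do {
            Page page = scanPage(table, key, limit);
            sizes.add(page.items.size());
            key = page.lastEvaluatedKey;
        } while (key != null);
        return sizes;
    }

    public static void main(String[] args) {
        // Four items read two at a time: the second page ends exactly at the last item.
        System.out.println(pageSizes(Arrays.asList("a", "b", "c", "d"), 2)); // [2, 2, 0]
    }
}
```

The second page still returns a key, and only the third (empty) page signals the end, so the loop condition has to be the key, not the number of items on the page.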

Answer 2

// Placeholders (#... for names, :... for values) are resolved through the two maps below.
var expr = new Expression();
expr.ExpressionStatement = "contains(#Name, :Name) and attribute_not_exists(#FullName) or #FullName = :FullName";
expr.ExpressionAttributeNames["#FullName"] = "FullName";
expr.ExpressionAttributeValues[":FullName"] = "suming singh";
expr.ExpressionAttributeNames["#Name"] = "Name";
expr.ExpressionAttributeValues[":Name"] = "sumit singh";

ScanOperationConfig config = new ScanOperationConfig()
{
    Limit = 2,
    PaginationToken = "{}",
    //Filter = filter,
    FilterExpression = expr,
    //AttributesToGet = attributesToGet,
    //Select = SelectValues.SpecificAttributes,
    TotalSegments = 1
};
var item = _tableContext.FromScanTableAsync(config);

// Keep fetching pages until the search reports that it is done.
do
{
    documentList.AddRange(await item.GetNextSetAsync());
} while (!item.IsDone);

DynamoDB scan: filter all records where an attribute does not exist
