MongoSource connector from Kafka creating a strange _data key

Date: 2019-12-20 21:00:30

Tags: mongodb apache-kafka apache-kafka-connect mongodb-kafka-connector

I am running the Kafka Connect MongoDB source connector with the following configuration:

curl -X PUT http://localhost:8083/connectors/mongo-source2/config -H "Content-Type: application/json" -d '{
  "name":"mongo-source2",
  "tasks.max":1,
  "connector.class":"com.mongodb.kafka.connect.MongoSourceConnector",
  "key.converter":"org.apache.kafka.connect.storage.StringConverter",
  "value.converter":"org.apache.kafka.connect.storage.StringConverter",
  "connection.uri":"mongodb://xxx:xxx@localhost:27017/mydb",
  "database":"mydb",
  "collection":"claimmappingrules.66667777-8888-9999-0000-666677770000",
  "pipeline":"[{\"$addFields\": {\"something\":\"xxxx\"} }]",
  "transforms":"dropTopicPrefix",
  "transforms.dropTopicPrefix.type":"org.apache.kafka.connect.transforms.RegexRouter",
  "transforms.dropTopicPrefix.regex":".*",
  "transforms.dropTopicPrefix.replacement":"my-topic"
}'

For some reason, when I consume the messages I get a strange key:

 "_id": {
"_data": "825DFD2A53000000012B022C0100296E5A1004060C0FB7484A4990A7363EF5F662CF8D465A5F6964005A1003F9974744D06AFB498EF8D78370B0CD440004"
  }

I have no idea where it comes from; the _id of my Mongo documents is a UUID. When consuming the messages, I would expect to see the documentKey field as the record key.

Here is an example of a message the connector publishes to Kafka:

{
  "_id": {
    "_data": "825DFD2A53000000012B022C0100296E5A1004060C0FB7484A4990A7363EF5F662CF8D465A5F6964005A1003F9974744D06AFB498EF8D78370B0CD440004"
  },
  "operationType": "replace",
  "clusterTime": {
    "$timestamp": {
      "t": 1576872531,
      "i": 1
    }
  },
  "fullDocument": {
    "_id": {
      "$binary": "+ZdHRNBq+0mO+NeDcLDNRA==",
      "$type": "03"
    },
    ...
  },
  "ns": {
    "db": "security",
    "coll": "users"
  },
  "documentKey": {
    "_id": {
      "$binary": "+ZdHRNBq+0mO+NeDcLDNRA==",
      "$type": "03"
    }
  }
}

1 Answer:

Answer 0 (score: 0)

Documentation on the Kafka Connect configuration options is very limited. I know it is too late to reply, but I recently ran into the same issue and found a solution by trial and error.

I added these two settings to my mongodb-kafka-connect configuration:

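A minimal sketch of what such key-output settings look like, assuming MongoDB Kafka Connector 1.3 or later, where output.format.key and output.schema.key are available; the schema below is a placeholder, and declaring _id as a string is a simplification, since the question's _id is actually a binary UUID:

"output.format.key": "schema",
"output.schema.key": "{\"type\": \"record\", \"name\": \"keySchema\", \"fields\": [{\"name\": \"documentKey\", \"type\": {\"type\": \"record\", \"name\": \"documentKey\", \"fields\": [{\"name\": \"_id\", \"type\": \"string\"}]}}]}"

The intent is that the record key is then built from the fields declared in output.schema.key (here the documentKey) rather than from the connector's default key, which is the full change-stream _id, i.e. the resume token.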

But even after this, I still don't know whether using the change stream's resume_token as the key for Kafka partition assignment matters for performance, or what happens when the resume_token expires after a long period of inactivity.

P.S. The final version of my Kafka Connect configuration with MongoDB as the source looks like this:

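As an illustration only (not the answerer's exact settings), a complete source configuration along these lines could combine the values from the question with the two key-output options above; the key schema remains a placeholder:

{
  "name": "mongo-source2",
  "tasks.max": 1,
  "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "org.apache.kafka.connect.storage.StringConverter",
  "connection.uri": "mongodb://xxx:xxx@localhost:27017/mydb",
  "database": "mydb",
  "collection": "claimmappingrules.66667777-8888-9999-0000-666677770000",
  "pipeline": "[{\"$addFields\": {\"something\":\"xxxx\"} }]",
  "transforms": "dropTopicPrefix",
  "transforms.dropTopicPrefix.type": "org.apache.kafka.connect.transforms.RegexRouter",
  "transforms.dropTopicPrefix.regex": ".*",
  "transforms.dropTopicPrefix.replacement": "my-topic",
  "output.format.key": "schema",
  "output.schema.key": "{\"type\": \"record\", \"name\": \"keySchema\", \"fields\": [{\"name\": \"documentKey\", \"type\": {\"type\": \"record\", \"name\": \"documentKey\", \"fields\": [{\"name\": \"_id\", \"type\": \"string\"}]}}]}"
}

Note that with key.converter set to StringConverter the structured key may simply be stringified; keeping its structure downstream may require a JSON or Avro key converter instead.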