Apache NiFi does not recognize the decimal type in the ConvertJSONToAvro processor

Time: 2018-01-29 18:50:03

Tags: avro apache-nifi parquet

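I have a ConvertJSONToAvro processor in NiFi 1.4 and am having difficulty getting the proper decimal data type in the Avro output. The flow is: the ExecuteSQL processor, with logical Avro data types enabled, writes the decimal values as bytes; a ConvertAvroToJSON processor converts the Avro to JSON; a ConvertJSONToAvro processor then converts it back to Avro so that PutParquet can write it into HDFS.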

My schema is:

    {
      "type" : "record",
      "name" : "schema",
      "fields" : [ {
        "name" : "entryDate",
        "type" : [ "null", {
          "type" : "long",
          "logicalType" : "timestamp-micros"
        } ],
        "default" : null
      }, {
        "name" : "points",
        "type" : [ "null", {
          "type" : "bytes",
          "logicalType" : "decimal",
          "precision" : 18,
          "scale" : 6
        } ],
        "default" : null
      } ]
    }

My JSON:

    { "entryDate" : 2018-01-26T13:48:22.087, "points" : 6.000000 }

The conversion to Avro fails with an error.

Is there some kind of workaround?...
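For context on why a plain JSON number such as 6.000000 does not map directly onto this field: the Avro specification stores a `decimal` on `bytes` as the two's-complement, big-endian bytes of the unscaled integer, with the scale taken from the schema. The Python sketch below illustrates only that encoding. The helper names are hypothetical, and it is not what NiFi runs internally.

```python
from decimal import Decimal


def encode_avro_decimal(value: Decimal, scale: int) -> bytes:
    """Encode a Decimal as Avro `decimal` bytes (hypothetical helper)."""
    # Shift the decimal point right by `scale` digits to get the unscaled integer.
    # Digits beyond `scale` are truncated by int(); a real writer should reject them.
    unscaled = int(value.scaleb(scale))
    # Smallest byte count that holds the value in signed two's complement.
    length = ((unscaled + (unscaled < 0)).bit_length() + 8) // 8
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def decode_avro_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode Avro `decimal` bytes back into a Decimal (hypothetical helper)."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


if __name__ == "__main__":
    raw = encode_avro_decimal(Decimal("6.000000"), scale=6)
    print(raw.hex())                        # 5b8d80 (unscaled value 6000000)
    print(decode_avro_decimal(raw, scale=6))  # 6.000000
```

With scale 6, the value 6.000000 from the JSON above becomes the unscaled integer 6000000, so the bytes on the wire bear no resemblance to the textual number.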

1 Answer:

Answer 0: (score: 0)

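Due to a bug in Avro, you currently cannot combine the null type with a logical type in a union. See this issue, which was still unresolved at the time of writing: https://issues.apache.org/jira/browse/AVRO-1891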

Also, once a field's type is no longer a union with null, its default value cannot be null. This should work for you:

    {
      "type" : "record",
      "name" : "schema",
      "fields" : [ {
        "name" : "entryDate",
        "type" : {
          "type" : "long",
          "logicalType" : "timestamp-micros"
        },
        "default" : 0
      }, {
        "name" : "points",
        "type" : {
          "type" : "bytes",
          "logicalType" : "decimal",
          "precision" : 18,
          "scale" : 6
        },
        "default" : ""
      } ]
    }
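
If many schemas need the same change, the rewrite above can be automated. The following is a minimal Python sketch, assuming a record schema already loaded as a dict. The function name is hypothetical, and the replacement defaults simply mirror the ones used in this answer ("" for bytes, 0 otherwise), so base types other than bytes and numeric ones are not handled. Note that the rewritten fields are no longer nullable.

```python
import json


def unwrap_nullable_logical_types(schema: dict) -> dict:
    """Return a copy of a record schema in which every ["null", {logical type}]
    union is replaced by the logical type alone (hypothetical helper)."""
    fixed = json.loads(json.dumps(schema))  # cheap deep copy of JSON-like data
    for field in fixed.get("fields", []):
        ftype = field.get("type")
        if not isinstance(ftype, list):
            continue  # not a union, nothing to unwrap
        non_null = [t for t in ftype if t != "null"]
        if (len(non_null) == 1 and isinstance(non_null[0], dict)
                and "logicalType" in non_null[0]):
            field["type"] = non_null[0]
            # A null (or missing) default is replaced by a value of the base type.
            if field.get("default") is None:
                field["default"] = "" if non_null[0]["type"] == "bytes" else 0
    return fixed


if __name__ == "__main__":
    original = {
        "type": "record",
        "name": "schema",
        "fields": [
            {"name": "entryDate",
             "type": ["null", {"type": "long", "logicalType": "timestamp-micros"}],
             "default": None},
            {"name": "points",
             "type": ["null", {"type": "bytes", "logicalType": "decimal",
                               "precision": 18, "scale": 6}],
             "default": None},
        ],
    }
    print(json.dumps(unwrap_nullable_logical_types(original), indent=2))
```

The input dict is left untouched, so the original and rewritten schemas can be compared side by side before either is configured in the processor.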