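I have a ConvertJSONToAvro processor in NiFi 1.4 and I am having a hard time getting the correct decimal data type in Avro. The data is converted to bytes using logical Avro data types in the ExecuteSQL processor, the Avro is converted to JSON with the ConvertAvroToJSON processor, and then the ConvertJSONToAvro processor turns it back into Avro so that PutParquet can write it to HDFS.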
My schema is:
{
"type" : "record",
"name" : "schema",
"fields" : [ {
"name" : "entryDate",
"type" : [ "null", {
"type" : "long",
"logicalType" : "timestamp-micros"
} ],
"default" : null
}, {
"name" : "points",
"type" : [ "null", {
"type" : "bytes",
"logicalType" : "decimal",
"precision" : 18,
"scale" : 6
} ],
"default" : null
} ]
}
My JSON:
{
"entryDate" : 2018-01-26T13:48:22.087,
"points" : 6.000000
}
I get an error from Avro.
Is there some kind of workaround?...
Answer 0 (score: 0)
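Due to a bug in Avro, you currently cannot combine the null type with a logical type in a union. See this issue, which was still unresolved at the time of writing: https://issues.apache.org/jira/browse/AVRO-1891
In addition, once a field is no longer a union with null, its default value cannot be null. This should work for you: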
{
"type" : "record",
"name" : "schema",
"fields" : [ {
"name" : "entryDate",
"type" : {
"type" : "long",
"logicalType" : "timestamp-micros"
},
"default" : 0
}, {
"name" : "points",
"type" : {
"type" : "bytes",
"logicalType" : "decimal",
"precision" : 18,
"scale" : 6
},
"default" : ""
}]
}
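To make it clearer what the two logical types in this schema expect underneath, here is a minimal Python sketch. Per the Avro specification, a `decimal` is stored as the big-endian two's-complement bytes of the unscaled integer, and `timestamp-micros` is a long counting microseconds since the Unix epoch. The function names are hypothetical, the timestamp is assumed to be UTC (the question does not say), and this is not how NiFi performs the conversion internally; it only illustrates why a JSON number such as 6.000000 or an ISO date string does not map directly onto `bytes` or `long`.

```python
from datetime import datetime, timezone
from decimal import Decimal


def decimal_to_avro_bytes(value: Decimal, scale: int) -> bytes:
    """Encode a decimal as big-endian two's-complement bytes of its unscaled value.

    Hypothetical helper for illustration; `scale` must match the schema (6 here).
    """
    scaled = value.scaleb(scale)  # shift the decimal point right by `scale` digits
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{value} has more than {scale} fractional digits")
    unscaled = int(scaled)
    # Enough bytes to hold the value plus a sign bit (at least one byte).
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def iso_to_timestamp_micros(text: str) -> int:
    """Convert an ISO-like timestamp to microseconds since the epoch.

    Assumption: the input has no zone information and is treated as UTC.
    """
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f")
    delta = parsed.replace(tzinfo=timezone.utc) - datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer arithmetic avoids floating-point rounding of the microseconds.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


if __name__ == "__main__":
    print(decimal_to_avro_bytes(Decimal("6.000000"), scale=6).hex())
    print(iso_to_timestamp_micros("2018-01-26T13:48:22.087"))
```

For the sample record, the unscaled value of `points` is 6000000, so the encoded bytes are `5b8d80` in hex. Note also that the `entryDate` value in the question is unquoted, which is not valid JSON on its own.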