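Each flow file contains 2000 records. I want to parse 01/01/2000 into a column year = 2000, a column month = Jan, and a column day = 01,
i.e. convert the input column 01/01/2000 into 3 values separated by commas: 01,Jan,2000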
Answer 0 (score: 4)
Let's say you have a schema like this for a person with a birthday, and you want to split up the birthday:
{
"name": "person",
"namespace": "nifi",
"type": "record",
"fields": [
{ "name": "first_name", "type": "string" },
{ "name": "last_name", "type": "string" },
{ "name": "birthday", "type": "string" }
]
}
You would need to modify the schema so that it has the fields you want to add:
{
"name": "person",
"namespace": "nifi",
"type": "record",
"fields": [
{ "name": "first_name", "type": "string" },
{ "name": "last_name", "type": "string" },
{ "name": "birthday", "type": "string" },
{ "name": "birthday_year", "type": ["null", "string"] },
{ "name": "birthday_month", "type": ["null", "string"] },
{ "name": "birthday_day", "type": ["null", "string"] }
]
}
Let's say the input record contains the following text:
bryan,bende,1980-01-01
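We can use UpdateRecord with a CsvReader and a CsvWriter, and UpdateRecord can populate the three fields we want by parsing the original birthday field.
If we send the output to LogAttribute, we should now see the following: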
first_name,last_name,birthday,birthday_year,birthday_month,birthday_day
bryan,bende,1980-01-01,1980,01,01
Here is a link to the Record Path Guide for details on the toDate and format functions:
https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
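The UpdateRecord property values are not shown in this answer. Presumably each new field is set with something like `format(toDate(/birthday, "yyyy-MM-dd"), "yyyy")`, but that exact expression is an assumption based on the two functions named above. As a rough illustration of the same parse-then-format idea outside NiFi, here is a minimal Python sketch; the function name `split_birthday` is hypothetical and the code is not part of NiFi.

```python
import csv
import io
from datetime import datetime


def split_birthday(csv_text: str) -> str:
    """Add birthday_year/month/day columns to headerless CSV rows."""
    in_fields = ["first_name", "last_name", "birthday"]
    out_fields = in_fields + ["birthday_year", "birthday_month", "birthday_day"]

    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=in_fields)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=out_fields, lineterminator="\n")
    writer.writeheader()

    for row in reader:
        # Parse the string into a date first (the role toDate plays) ...
        parsed = datetime.strptime(row["birthday"], "%Y-%m-%d")
        # ... then format each part back to a string (the role format plays).
        row["birthday_year"] = parsed.strftime("%Y")
        row["birthday_month"] = parsed.strftime("%m")
        row["birthday_day"] = parsed.strftime("%d")
        writer.writerow(row)

    return out.getvalue()


result = split_birthday("bryan,bende,1980-01-01\n")
print(result)
```

Keeping the new columns as strings, as the modified schema does, preserves the leading zeros in the month and day.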
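Answer 1 (score: 3)
You can use UpdateRecord for this. Assuming your input record has a date column named "myDate", set the Replacement Value Strategy to Record Path Value; your user-defined properties could then look something like: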
/day format(/myDate, "dd")
/month format(/myDate, "MMM")
/year format(/myDate, "yyyy")
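A note on the year pattern: in Java date patterns, which the format function accepts, lowercase `yyyy` is the calendar year, while uppercase `YYYY` is the week-based year. The two can differ for dates close to New Year. The Python sketch below shows the difference using ISO week rules as a stand-in. Java's week-year rules depend on the locale, so the exact dates affected in NiFi may differ.

```python
from datetime import date

# A Monday that falls in the first ISO week of the following year.
d = date(2019, 12, 30)

calendar_year = d.strftime("%Y")      # calendar year, analogous to "yyyy"
week_year = str(d.isocalendar()[0])   # week-based year, analogous to "YYYY"

print(calendar_year, week_year)
```

For splitting a date into day, month and year, the calendar year is almost certainly the one you want.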
Your output schema would look like:
{
"namespace": "nifi",
"name": "myRecord",
"type": "record",
"fields": [
{"name": "day","type": "int"},
{"name": "month","type": "string"},
{"name": "year","type": "int"}
]
}
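To make the effect of this schema concrete, here is a small Python sketch that produces the same three fields from the value in the question. It assumes the input is in month/day/year order, which 01/01/2000 does not reveal; `split_date` and `MONTH_ABBR` are hypothetical names used only for illustration.

```python
from datetime import datetime

# Fixed English abbreviations, so the result does not depend on the locale.
MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def split_date(text: str, pattern: str = "%m/%d/%Y") -> dict:
    """Split a date string into day (int), month (abbreviation), year (int)."""
    parsed = datetime.strptime(text, pattern)
    return {
        "day": parsed.day,                       # int, as in the schema
        "month": MONTH_ABBR[parsed.month - 1],   # string, e.g. "Jan"
        "year": parsed.year,                     # int
    }


record = split_date("01/01/2000")
# An int day loses its leading zero, so pad it when writing the CSV line.
line = f'{record["day"]:02d},{record["month"]},{record["year"]}'
print(record)
print(line)
```

Because an int field cannot carry a leading zero, declaring day as a string in the schema may be closer to the requested output of 01,Jan,2000.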