NiFi: How to replace ',' with a newline in text using ReplaceText

Date: 2017-09-15 07:19:36

Tags: json regex apache-nifi

I want to parse a JSON response in a NiFi processor. I have JSON data like this:

{
  "squadName": "Super hero squad",
  "homeTown": "Metro City",
  "formed": 2016,
  "secretBase": "Super tower",
  "active": true,
  "Data":{"row": [
    {
      "name": "Molecule Man",
      "age": 29,
      "secretIdentity": "Dan Jukes",
      "powers": [
        "Radiation resistance",
        "Turning tiny",
        "Radiation blast"
      ]
    },
    {
      "name": "Madame Uppercut",
      "age": 39,
      "secretIdentity": "Jane Wilson",
      "powers": [
        "Million tonne punch",
        "Damage resistance",
        "Superhuman reflexes"
      ]
    },
    {
      "name": "Eternal Flame",
      "age": 1000000,
      "secretIdentity": "Unknown",
      "powers": [
        "Immortality",
        "Heat Immunity",
        "Inferno",
        "Teleportation",
        "Interdimensional travel"
      ]
    }
  ]}
}

I would like to convert it to the following format:

{"name": "Molecule Man", "age": 29,  "secretIdentity": "Dan Jukes", "powers":  ["Radiation resistance", "Turning tiny", "Radiation blast"]}
{"name": "Molecule Man", "age": 29,  "secretIdentity": "Dan Jukes", "powers":  ["Radiation resistance", "Turning tiny", "Radiation blast"]}
{"name": "Molecule Man", "age": 29,  "secretIdentity": "Dan Jukes", "powers":  ["Radiation resistance", "Turning tiny", "Radiation blast"]}

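I already use this expression in the EvaluateJsonPath processor: $.Data['row'], which gives me the row data. I then use another expression in the ReplaceText processor, [], to get rid of the '[]'. But I cannot replace the ',' with a newline. What should I do?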

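1 Answer:

Answer 0 (score: 1)

Solution

If all you want is each row on a single line, you can simply remove every newline that is not preceded by },. Suppose that after the steps described in your last paragraph you end up with this: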

{
  "name": "Molecule Man",
  "age": 29,
  "secretIdentity": "Dan Jukes",
  "powers": [
    "Radiation resistance",
    "Turning tiny",
    "Radiation blast"
  ]
},
{
  "name": "Madame Uppercut",
  "age": 39,
  "secretIdentity": "Jane Wilson",
  "powers": [
    "Million tonne punch",
    "Damage resistance",
    "Superhuman reflexes"
  ]
},
{
  "name": "Eternal Flame",
  "age": 1000000,
  "secretIdentity": "Unknown",
  "powers": [
    "Immortality",
    "Heat Immunity",
    "Inferno",
    "Teleportation",
    "Interdimensional travel"
  ]
}

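Now replace (?<!},)\n with nothing (leave the replacement value empty; it is not a space). The original answer linked a demo of this change on Regex101.com.

You can also get rid of the repeated spaces by collapsing every run of whitespace into a single space: replace (?<!},)\s+ with a single space character. The original answer linked a second demo for this step.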
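The two replacements can be tried outside NiFi before configuring ReplaceText. The following is a minimal Python sketch, assuming Python's re module treats these two patterns the same way as the Java regex engine that NiFi uses. The function name flatten_rows and the shortened sample are made up for illustration.

```python
import re

def flatten_rows(text: str) -> str:
    # Step 1: delete every newline that is NOT preceded by "},".
    # Newlines after "}," separate the records, so they are kept.
    joined = re.sub(r"(?<!},)\n", "", text)
    # Step 2: collapse each remaining whitespace run to one space,
    # again skipping the newline that directly follows "},".
    return re.sub(r"(?<!},)\s+", " ", joined)

# Hypothetical shortened input, shaped like the rows shown above.
sample = """{
  "name": "Molecule Man",
  "age": 29
},
{
  "name": "Madame Uppercut",
  "age": 39
}"""

result = flatten_rows(sample)
print(result)
```

Each record ends up on its own line, still followed by the comma that separated it from the next record.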

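How does it work?

I split the work into two phases (you could do it with a single regex, but I split it for simplicity). First, I look for every newline in the text that is not preceded by }, and remove it. The newlines that do follow }, are the ones that separate the rows, so those are the ones we do not want to remove.

After that removal we almost have what we want, but the result is ugly because of the repeated spaces and broken formatting. So I search again, this time for all whitespace characters (once more excluding the position right after }, because a newline is also a whitespace character), and replace every run of whitespace with a single space.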
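The answer says a single regular expression would also work but does not show one. One possible candidate, offered only as a sketch that has not been checked inside NiFi, is to apply the whitespace pattern (?<!},)\s+ directly to the multi-line text, since \s already matches newlines. The visible difference from the two-phase version is a space left between ] and }. The sample data and variable names below are illustrative.

```python
import re

# Hypothetical shortened input, shaped like the rows in the answer.
rows = """{
  "name": "Eternal Flame",
  "powers": [
    "Immortality",
    "Inferno"
  ]
},
{
  "name": "Molecule Man",
  "powers": [
    "Turning tiny"
  ]
}"""

# Every whitespace run (newlines included) that is not directly preceded
# by "}," becomes one space; the newline right after "}," is left alone.
one_pass = re.sub(r"(?<!},)\s+", " ", rows)
print(one_pass)
```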

Final result:

{ "name": "Molecule Man", "age": 29, "secretIdentity": "Dan Jukes", "powers": [ "Radiation resistance", "Turning tiny", "Radiation blast" ]},
{ "name": "Madame Uppercut", "age": 39, "secretIdentity": "Jane Wilson", "powers": [ "Million tonne punch", "Damage resistance", "Superhuman reflexes" ]},
{ "name": "Eternal Flame", "age": 1000000, "secretIdentity": "Unknown", "powers": [ "Immortality", "Heat Immunity", "Inferno", "Teleportation", "Interdimensional travel" ]}
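Note that the lines above still end with the separating comma, while the format asked for in the question has none. As a cross-check of the target format, here is a small Python sketch that uses a different technique from the regex answer: it parses the JSON and serializes each row separately. It runs outside NiFi, and the shortened document and the function name rows_as_lines are assumptions made for illustration.

```python
import json

# Shortened copy of the document from the question (illustrative values only).
document = """{
  "squadName": "Super hero squad",
  "Data": {"row": [
    {"name": "Molecule Man", "age": 29, "powers": ["Turning tiny"]},
    {"name": "Madame Uppercut", "age": 39, "powers": ["Damage resistance"]}
  ]}
}"""

def rows_as_lines(raw: str) -> str:
    # Same selection as the JsonPath $.Data['row'] from the question.
    rows = json.loads(raw)["Data"]["row"]
    # json.dumps without an indent writes each object on a single line.
    return "\n".join(json.dumps(row) for row in rows)

lines = rows_as_lines(document)
print(lines)
```

Because each row is serialized on its own, no commas are left between the lines and every line is a valid JSON object by itself.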