Formatting a JSON file in R: lexical error with character encoding

Time: 2016-04-02 06:27:42

Tags: json r

I have data retrieved from a server in JSON format. I now want to preprocess this data in R.

My raw .json file (when opened in a text editor) looks like this:

{"id": 1,"data": "{\"unid\":\"wU6993\",\"age\":\"21\",\"origin\":\"Netherlands\",\"biling\":\"2\",\"langs\":\"Dutch\",\"selfrating\":\"80\",\"selfarrest\":\"20\",\"condition\":1,\"fly\":\"2\",\"flytime\":0,\"purpose\":\"na\",\"destin\":\"Madrid\",\"txtQ1\":\"I\'m flying to Madrid to catch up with friends.\"}"}

I want to parse it back into its intended format for further use:

{
"id": 1,
"data": {
  "unid": "wU6993",
  "age": "21",
  "origin": "Netherlands",
  "biling": "2",
  "langs": "Dutch",
  "selfrating": "80",
  "selfarrest": "20",
  "condition": 1,
  "fly": "2",
  "flytime": 0,
  "purpose": "na",
  "destin": "Madrid",
  "txtQ1": "I'm flying to Madrid to catch up with friends."
}
}

Using jsonlite, I cannot read it at all:

parsed = jsonlite::fromJSON(txt = 'exp1.json')
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) : 
  lexical error: inside a string, '\' occurs before a character which it may not.
          in\":\"Madrid\",\"txtQ1\":\"I\'m flying to Madrid to catch u
                     (right here) ------^

I think the error is telling me that some characters should be escaped.

How can I fix this and read my file?

1 answer:

Answer 0: (score: 2)

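There are extra quotes around the nested braces that define "data", so its value is stored as one giant string rather than as a nested JSON object. Note that pasting the text into a single-quoted R literal, as below, makes R itself consume the backslashes (\" becomes " and \' becomes '). Take the extra quotes out: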

my_json <- '{"id": 1,"data": "{\"unid\":\"wU6993\",\"age\":\"21\",\"origin\":\"Netherlands\",\"biling\":\"2\",\"langs\":\"Dutch\",\"selfrating\":\"80\",\"selfarrest\":\"20\",\"condition\":1,\"fly\":\"2\",\"flytime\":0,\"purpose\":\"na\",\"destin\":\"Madrid\",\"txtQ1\":\"I\'m flying to Madrid to catch up with friends.\"}"}'

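# Drop the quote just before the opening brace of the "data" object
my_json <- sub('"\\{', '\\{', my_json)
# Drop the quote just after its closing brace
my_json <- sub('\\}"', '\\}', my_json)

# Pretty-print the resulting JSON
jsonlite::prettify(my_json)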
# {
#     "id": 1,
#     "data": {
#         "unid": "wU6993",
#         "age": "21",
#         "origin": "Netherlands",
#         "biling": "2",
#         "langs": "Dutch",
#         "selfrating": "80",
#         "selfarrest": "20",
#         "condition": 1,
#         "fly": "2",
#         "flytime": 0,
#         "purpose": "na",
#         "destin": "Madrid",
#         "txtQ1": "I'm flying to Madrid to catch up with friends."
#     }
# }
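
If you read the text from the file instead of pasting it into an R string literal, the backslashes remain in the text, so removing the outer quotes alone would not yield valid JSON. Below is a minimal sketch of one alternative, not a definitive fix. It assumes that the only invalid escape in the file is \' (JSON does not define that escape) and that you are on R 4.0 or later, since it uses a raw string to build a shortened stand-in for exp1.json. The temporary file and the names raw_text, path, txt, and parsed are illustrative only.

```r
library(jsonlite)

# Stand-in for exp1.json: a shortened copy of the raw file contents.
# r"(...)" is a raw string (R >= 4.0), so the backslashes are kept verbatim.
raw_text <- r"({"id": 1,"data": "{\"unid\":\"wU6993\",\"condition\":1,\"txtQ1\":\"I\'m flying to Madrid to catch up with friends.\"}"})"
path <- tempfile(fileext = ".json")
writeLines(raw_text, path)

# Read the file as plain text (no JSON parsing yet)
txt <- paste(readLines(path, warn = FALSE), collapse = "\n")

# Assumption: \' is the only illegal escape; replace it with a bare apostrophe.
# fixed = TRUE makes gsub treat the pattern as a literal backslash + apostrophe.
txt <- gsub("\\'", "'", txt, fixed = TRUE)

# First pass: "data" comes back as a single JSON-encoded string
parsed <- fromJSON(txt)

# Second pass: parse that string into a nested list
parsed$data <- fromJSON(parsed$data)

str(parsed)
```

From there, toJSON(parsed, auto_unbox = TRUE, pretty = TRUE) should print the nested structure. Parsing in two passes keeps the inner escaping handled by the JSON parser rather than by regular expressions, which may be less fragile if the free-text fields ever contain braces or quotes.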