I have data retrieved from a server in JSON format, and I now want to preprocess it in R.
My raw .json file (opened in a text editor) looks like this:
{"id": 1,"data": "{\"unid\":\"wU6993\",\"age\":\"21\",\"origin\":\"Netherlands\",\"biling\":\"2\",\"langs\":\"Dutch\",\"selfrating\":\"80\",\"selfarrest\":\"20\",\"condition\":1,\"fly\":\"2\",\"flytime\":0,\"purpose\":\"na\",\"destin\":\"Madrid\",\"txtQ1\":\"I\'m flying to Madrid to catch up with friends.\"}"}
I want to parse it back into its intended format for further use:
`{
    "id": 1,
    "data": {
        "unid": "wU6993",
        "age": "21",
        "origin": "Netherlands",
        "biling": "2",
        "langs": "Dutch",
        "selfrating": "80",
        "selfarrest": "20",
        "condition": 1,
        "fly": "2",
        "flytime": 0,
        "purpose": "na",
        "destin": "Madrid",
        "txtQ1": "I'm flying to Madrid to catch up with friends."
    }
}`
Using jsonlite, I cannot read the file at all:
parsed = jsonlite::fromJSON(txt = 'exp1.json')
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) : lexical error: inside a string, '\' occurs before a character which it may not. in\":\"Madrid\",\"txtQ1\":\"I\'m flying to Madrid to catch u (right here) ------^
I think the error is telling me that some characters need to be escaped.
How can I fix this and read my file?
Answer 0 (score: 2)
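There are extra quotes around the nested braces that define "data", so its value is actually stored as one giant string rather than as a nested JSON object. Take them out: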
my_json <- '{"id": 1,"data": "{\"unid\":\"wU6993\",\"age\":\"21\",\"origin\":\"Netherlands\",\"biling\":\"2\",\"langs\":\"Dutch\",\"selfrating\":\"80\",\"selfarrest\":\"20\",\"condition\":1,\"fly\":\"2\",\"flytime\":0,\"purpose\":\"na\",\"destin\":\"Madrid\",\"txtQ1\":\"I\'m flying to Madrid to catch up with friends.\"}"}'
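# Drop the quote that opens the "data" string (the first "{ in the text)
my_json <- sub('"\\{', '{', my_json)
# Drop the quote that closes it (the first }" in the text)
my_json <- sub('\\}"', '}', my_json)
# Pretty-print the result, which is now valid JSON
jsonlite::prettify(my_json)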
# {
#     "id": 1,
#     "data": {
#         "unid": "wU6993",
#         "age": "21",
#         "origin": "Netherlands",
#         "biling": "2",
#         "langs": "Dutch",
#         "selfrating": "80",
#         "selfarrest": "20",
#         "condition": 1,
#         "fly": "2",
#         "flytime": 0,
#         "purpose": "na",
#         "destin": "Madrid",
#         "txtQ1": "I'm flying to Madrid to catch up with friends."
#     }
# }
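
The snippet above works on text pasted into a single-quoted R string, where R itself turns \" and \' into plain quotes before any JSON parsing happens. In the file on disk those backslashes are still literal, and \' is not a valid JSON escape, which is what the lexical error points at. A minimal sketch for reading the file directly, assuming \' is the only invalid escape in it, is to fix that escape in the raw text and then parse twice: once for the outer object and once for the string held in "data". This is an alternative to the quote-stripping approach, not the answer's original method. The shortened record, the temp file, and the variable names raw_record, path, txt and parsed are made up for illustration, and the r"(...)" raw string needs R 4.0.0 or later.

```r
library(jsonlite)

# Hypothetical stand-in for exp1.json: a shortened record written to a temp file.
# The r"(...)" raw string (R >= 4.0.0) keeps the backslashes literal,
# just as they appear in the file on disk.
raw_record <- r"({"id": 1,"data": "{\"unid\":\"wU6993\",\"condition\":1,\"destin\":\"Madrid\",\"txtQ1\":\"I\'m flying to Madrid.\"}"})"
path <- tempfile(fileext = ".json")
writeLines(raw_record, path)

# Read the file as plain text instead of handing it straight to the parser
txt <- paste(readLines(path, warn = FALSE), collapse = "\n")

# \' is not a valid JSON escape, so replace it with a bare apostrophe
# (assumption: no other invalid escapes occur in the file)
txt <- gsub("\\'", "'", txt, fixed = TRUE)

# First pass: the outer object, whose "data" field is still one long string
parsed <- fromJSON(txt)

# Second pass: parse that string as JSON in its own right
parsed$data <- fromJSON(parsed$data)

str(parsed)
```

After the second pass, parsed$data is a named list, so fields such as parsed$data$destin can be used directly. If the nested text form is needed again, toJSON(parsed, auto_unbox = TRUE, pretty = TRUE) should produce it.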