How can I read multiple JSON structures contained in one file?

Asked: 2016-09-08 21:35:39

Tags: json r jsonlite

I have a .txt file with this structure:

section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"},...etc...}]
section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"},...etc...}]
...
section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"},...etc...}]

I tried to read it using these R commands:

library(jsonlite)
data <- fromJSON("myfile.txt")

But I get this error:

Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) : 
  lexical error: invalid char in json text.
                                       section2#[{"p": "0.99
                     (right here) ------^

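How can I read it, even if that means splitting it by section?

2 Answers:

Answer 0: (score: 4)

Strip the prefix and bind the flattened JSON arrays into a single data frame: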

raw_dat <- readLines(textConnection('section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]
section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]
section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'))

library(stringi)
library(purrr)
library(jsonlite)

stri_replace_first_regex(raw_dat, "^section[[:digit:]]+#", "") %>% 
  map_df(fromJSON)
##          p tag
## 1 0.999834  MA
## 2        1  MO
## 3   0.9995  NC
## 4        1  FL
## 5   0.9995  NC
## 6        1  FL
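
The question also asks about splitting by section, which the pipeline above discards along with the prefix. As a minimal sketch (not part of the original answer), the label could be kept as an extra column using only base R and jsonlite. The names `section`, `pieces`, `labelled`, and `by_section` are illustrative, and the sketch assumes every line has the form `label#[...]`:

```r
library(jsonlite)

# Sample lines in the same shape as the question's file
raw_dat <- c(
  'section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]',
  'section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]',
  'section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'
)

# Everything before the first "#" is the label; the rest is the JSON array
section <- sub("#.*$", "", raw_dat)
json    <- sub("^[^#]*#", "", raw_dat)

# Parse each line into a data frame and tag it with its section label
pieces <- Map(function(js, sec) {
  df <- fromJSON(js)
  df$section <- sec
  df
}, json, section)

# Stack the per-line data frames, then optionally split them by section
labelled <- do.call(rbind, unname(pieces))
rownames(labelled) <- NULL
by_section <- split(labelled, labelled$section)
```

For the real file, `raw_dat` would come from `readLines("myfile.txt")`. Note that `p` is parsed as character because the values are quoted in the JSON; `as.numeric(labelled$p)` converts it.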

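Answer 1: (score: 1)

Remove the section# prefix from each line. Your .txt then amounts to a 2D array with a JSON object at each index. You can access the elements by index: foo[0][0] is the first object of the first line, and in general foo[m][n] is object n of line m (both zero-based), with m running up to the number of sections - 1.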
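
The notation in this answer is zero-based, whereas R indexes from 1 and uses `[[ ]]` for nested lists, so `foo[0][0]` corresponds to `foo[[1]][[1]]`. A minimal sketch of the nested-list idea in R, assuming the same `label#[...]` line format as in the question; `foo`, `first_obj`, and `last_obj` are illustrative names:

```r
library(jsonlite)

# Sample lines in the same shape as the question's file
raw_dat <- c(
  'section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]',
  'section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]',
  'section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'
)

# Drop everything up to and including the first "#"
json <- sub("^[^#]*#", "", raw_dat)

# simplifyVector = FALSE keeps each line as a plain list of objects
# instead of collapsing it into a data frame
foo <- lapply(json, fromJSON, simplifyVector = FALSE)

# First object of the first line (R indices start at 1)
first_obj <- foo[[1]][[1]]

# Last object of the last line
last_line <- foo[[length(foo)]]
last_obj  <- last_line[[length(last_line)]]
```

Each object is a named list, so individual fields are reachable as `first_obj$tag` or `last_obj$p`.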