Question

I have a .txt file with this structure:

section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"},...etc...}]
section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"},...etc...}]
...
section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"},...etc...}]

I tried to read it with these R commands:

library(jsonlite)
data <- fromJSON("myfile.txt")

But I got this error:

Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) : 
  lexical error: invalid char in json text.
                                       section2#[{"p": "0.99
                     (right here) ------^

How can I read this file, even if that means splitting it by section?

Answer 1

Strip the prefix from each line and bind the flattened JSON arrays into a single data frame:

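# Stand-in for readLines("myfile.txt"): one "sectionN#<JSON array>" per line
raw_dat <- readLines(textConnection('section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]
section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]
section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'))

library(stringi)
library(purrr)     # provides map_df() and the %>% pipe; map_df() needs dplyr installed
library(jsonlite)

# Drop the leading "sectionN#" so each line is valid JSON, then parse every
# line with fromJSON() and row-bind the resulting data frames
stri_replace_first_regex(raw_dat, "^section[[:digit:]]+#", "") %>% 
  map_df(fromJSON)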
##          p tag
## 1 0.999834  MA
## 2        1  MO
## 3   0.9995  NC
## 4        1  FL
## 5   0.9995  NC
## 6        1  FL
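
The result above no longer shows which section each row came from. If that matters, one possible variation is to split each line at the first "#" and keep the left-hand part as a column. This is only a sketch using base R plus jsonlite. The `section` column name and the conversion of `p` to numeric are assumptions for illustration, not part of the original answer.

```r
library(jsonlite)

# Sample lines in the same "sectionN#<JSON array>" shape as the question
raw_dat <- c(
  'section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]',
  'section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]',
  'section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'
)

# Split each line at the first "#": label on the left, JSON text on the right
section <- sub("#.*$", "", raw_dat)
json    <- sub("^[^#]*#", "", raw_dat)

# Parse each line into a data frame and tag its rows with the section label
parts <- Map(function(s, j) cbind(section = s, fromJSON(j)), section, json)

# Stack the per-line data frames into one
result <- do.call(rbind, unname(parts))
rownames(result) <- NULL

# "p" is stored as a string in the sample data; convert it (assumed to be wanted)
result$p <- as.numeric(result$p)
```

With a real file, `raw_dat <- readLines("myfile.txt")` would replace the inline sample, and `split(result, result$section)` would give one data frame per section.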

Answer 2

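Remove the section# prefix from every line. Your .txt then amounts to a 2D array with a JSON object at each index. You can reach any element by index: foo[0][0] is the first object on the first line, and the last one is foo[m][n], where m is the number of sections minus 1 and n is the number of objects on that line minus 1.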
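
A minimal sketch of that nested structure in R, assuming jsonlite and the sample lines from the question. The name `foo` follows the answer, and the other variable names are made up for illustration. R indexes from 1, so the answer's foo[0][0] corresponds to `foo[[1]][[1]]` here.

```r
library(jsonlite)

raw_dat <- c(
  'section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]',
  'section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]',
  'section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'
)

# Remove everything up to and including the first "#" on each line
json <- sub("^[^#]*#", "", raw_dat)

# simplifyVector = FALSE keeps each line as a plain list of objects
# instead of collapsing it into a data frame
foo <- lapply(json, fromJSON, simplifyVector = FALSE)

# First object on the first line
first_obj <- foo[[1]][[1]]

# Last object on the last line
last_line <- foo[[length(foo)]]
last_obj  <- last_line[[length(last_line)]]
```

Each object is a named list, so its fields can be read with `first_obj$tag` and `first_obj$p`.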

How do I read multiple JSON structures contained in one file?
