How to print the first ten lines of a JSON file in R

Asked: 2016-11-25 16:31:05

Tags: r jsonlite

I have an R data frame and converted it to JSON using the jsonlite package:

jsonData <- toJSON(dataset)

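I want to verify that the conversion succeeded, but the dataset is very large. How can I print the first 5 lines of the JSON?

1 Answer:

Answer 0: (score: 1)

Converting R objects to JSON

Several packages implement a toJSON() function, for example jsonlite, rjson, and RJSONIO. I will use jsonlite below. The solution also works with RJSONIO, but not with rjson.

I have not found a way to print a subset of lines directly from the JSON string. The reason is that all three packages return a single string (i.e., a character vector of length 1) rather than a character vector in which each line of the JSON string occupies one element: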

length(jsonlite::toJSON(mtcars))
## [1] 1

Indeed, the converted object is one long line of text:

jsonlite::toJSON(mtcars)
## [{"mpg":21,"cyl":6,"disp":160,"hp":110,"drat":3.9,"wt":2.62,"qsec":16.46,"vs":0,"am":1,"gear":4,"carb":4,"_row":"Mazda RX4"},{"mpg":21,"cyl":6,"disp":160,"hp":110,"drat":3.9,"wt":2.875,"qsec":17.02,"vs":0,"am":1,"gear":4,"carb":4,"_row":"Mazda RX4 Wag"},{"mpg":22.8,"cyl":4,"disp":108,"hp":93,"drat":3.85,"wt":2.32,"qsec":18.61,"vs":1,"am":1,"gear":4,"carb":1,"_row":"Datsun 710"},{"mpg":21.4,"cyl":6,"disp":258,"hp":110,"drat":3.08,"wt":3.215,"qsec":19.44,"vs":1,"am":0,"gear":3,"carb":1,"_row":"Hornet 4 Drive"},{"mpg":18.7,"cyl":8,"disp":360,"hp":175,"drat":3.15,"wt":3.44,"qsec":17.02,"vs":0,"am":0,"gear":3,"carb":2,"_row":"Hornet Sportabout"},{"mpg":18.1,"cyl":6,"disp":225,"hp":105,"drat":2.76,"wt":3.46,"qsec":20.22,"vs":1,"am":0,"gear":3,"carb":1,"_row":"Valiant"},{"mpg":14.3,"cyl":8,"disp":360,"hp":245,"drat":3.21,"wt":3.57,"qsec":15.84,"vs":0,"am":0,"gear":3,"carb":4,"_row":"Duster 360"},{"mpg":24.4,"cyl":4,"disp":146.7,"hp":62,"drat":3.69,"wt":3.19,"qsec":20,"vs":1,"am":0,"gear":4,"c... <truncated>

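Since there is only one line, you gain nothing by printing just the first few lines.

However, the toJSON() function from jsonlite (and the one from RJSONIO) lets you break the JSON string into lines by passing pretty = TRUE (I truncated the output manually, since it is also long):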
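For a quick look at the single-line string anyway, one option is to print only its first characters. The following is a minimal sketch using base R's substr(); the cutoff of 80 characters and the variable names json and preview are arbitrary choices for illustration, not part of the original answer.

```r
library(jsonlite)

# Compact (single-line) JSON, as in the example above
json <- toJSON(mtcars)

# Keep only the first 80 characters (80 is an arbitrary cutoff)
preview <- substr(as.character(json), 1, 80)

# Print the preview followed by a newline
cat(preview, "\n")
```

This shows whether the string starts as expected, but it cuts the JSON at an arbitrary position rather than at a line or record boundary.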

jsonlite::toJSON(mtcars, pretty = TRUE)
## [
##   {
##     "mpg": 21,
##     "cyl": 6,
##     "disp": 160,
##     "hp": 110,
##     "drat": 3.9,
##     "wt": 2.62,
## ...

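It is still a character vector of length 1, but the lines are now separated by newline characters (\n), which can be used to reach the goal: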

length(jsonlite::toJSON(mtcars, pretty = TRUE))
## [1] 1
as.character(jsonlite::toJSON(mtcars, pretty = TRUE))
## [1] "[\n  {\n    \"mpg\": 21,\n    \"cyl\": 6,\n    \"disp\": 160,\n    \"hp\": 110,\n    \"drat\": 3.9,\n    \"wt\": 2.62,\n    \"qsec\": 16.46,\n    \"vs\": 0,\n    \"am\": 1,\n    \"gear\": 4,\n    \"carb\": 4,\n    \"_row\": \"Mazda RX4\"\n  },\n  {\n    \"mpg\": 21,\n    \"cyl\": 6,\n    \"disp\": 160,\n    \"hp\": 110,\n    \"drat\": 3.9,\n    \"wt\": 2.875,\n    \"qsec\": 17.02,\n    \"vs\": 0,\n    \"am\": 1,\n    \"gear\": 4,\n    \"carb\": 4,\n    \"_row\": \"Mazda RX4 Wag\"\n  },\n  {\n    \"mpg\": 22.8,\n    \"cyl\": 4,\n    \"disp\": 108,\n    \"hp\": 93,\n    \"drat\": 3.85,\n    \"wt\": 2.32,\n    \"qsec\": 18.61,\n    \"vs\": 1,\n    \"am\": 1,\n    \"gear\": 4,\n    \"carb\": 1,\n    \"_row\": \"Datsun 710\"\n  },\n  {\n    \"mpg\": 21.4,\n    \"cyl\": 6,\n    \"disp\": 258,\n    \"hp\": 110,\n    \"drat\": 3.08,\n    \"wt\": 3.215,\n    \"qsec\": 19.44,\n    \"vs\": 1,\n    \"am\": 0,\n    \"gear\": 3,\n    \"carb\": 1,\n    \"_row\": \"Hornet 4 Drive\"\n  },\n  {\n  ... <truncated>

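Printing only a subset of lines from the JSON

I wrote a small function that takes a JSON object as input and prints a subset of its lines. It only works if pretty = TRUE was used when the JSON object was created. Here it is: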

print_json_lines <- function(json, lines) {

  # break up into lines
  json_lines <- strsplit(json, "\n")[[1]]

  # get desired lines
  json_lines <- json_lines[lines]

  # print
  cat(paste(json_lines, collapse = "\n"))

  # return invisibly
  invisible(json_lines)

}

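It uses strsplit() to split the string into a character vector with one entry per line. The lines can then be selected by normal indexing with []. Since simply printing a character vector may put several strings on one output line, I combine the subset of lines into a single string again (using paste()), separating them with \n. This results in nicely formatted output: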

print_json_lines(jsonlite::toJSON(mtcars, pretty = TRUE), 1:5)
## [
##   {
##     "mpg": 21,
##     "cyl": 6,
##     "disp": 160,
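If the goal is only to check that the conversion looks right, another possibility is to convert just the first few rows of the data frame instead of subsetting the lines afterwards. This is a sketch of that alternative, not the approach described above; the row count of 2 and the name json_head are illustrative assumptions.

```r
library(jsonlite)

# Convert only the first two rows of the data frame (2 is arbitrary)
json_head <- toJSON(head(mtcars, 2), pretty = TRUE)

# Printing the json object shows the pretty-printed records
json_head
```

This avoids building the full JSON string for a very large dataset, but it only checks a sample, so it says nothing about rows further down.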

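Remarks on the different JSON packages

As I mentioned at the beginning, this solution works with jsonlite and RJSONIO. The reason is simply that both let you split the JSON string into lines with pretty = TRUE. However, the output looks different with RJSONIO, because it uses a different convention in the conversion: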

print_json_lines(RJSONIO::toJSON(mtcars, pretty = TRUE), 1:5)
## {
##     "mpg" : [
##             21,
##             21,
##             22.8,

The function does not work with rjson, because I was not able to split the JSON object into lines.
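One possible workaround for compact JSON text is to re-indent it with jsonlite::prettify() before splitting it into lines. The sketch below assumes that the input is a valid JSON string; it produces the compact string with jsonlite for the sake of a self-contained example and has not been tested against rjson output. The names compact, pretty, and pretty_lines are illustrative.

```r
library(jsonlite)

# A compact one-line JSON string (generated with jsonlite here;
# the assumption is that any valid JSON text could be used instead)
compact <- as.character(toJSON(head(mtcars, 3)))

# Re-indent the compact text so that it contains newlines
pretty <- prettify(compact)

# Split into lines and print the first five
pretty_lines <- strsplit(as.character(pretty), "\n")[[1]]
cat(head(pretty_lines, 5), sep = "\n")
```

Because prettify() parses and re-serializes the text, it adds extra work for very large inputs.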