I have an R data frame and convert it to JSON with the jsonlite package:
jsonData <- toJSON(dataset)
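I want to verify that the conversion succeeded, but the dataset is very large. How can I print the first 5 lines of the JSON output?
Answer 0 (score: 1)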
Several packages implement a toJSON() function, for example jsonlite, rjson and RJSONIO. I will use jsonlite below. The solution also works with RJSONIO, but not with rjson.
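I have not found a way to print a subset of lines directly from the JSON string. The reason is that all three packages return a single string (i.e. a character vector of length 1) rather than a character vector in which each line of the JSON string occupies one element: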
length(jsonlite::toJSON(mtcars))
## [1] 1
In fact, the converted object is one long string of text:
jsonlite::toJSON(mtcars)
## [{"mpg":21,"cyl":6,"disp":160,"hp":110,"drat":3.9,"wt":2.62,"qsec":16.46,"vs":0,"am":1,"gear":4,"carb":4,"_row":"Mazda RX4"},{"mpg":21,"cyl":6,"disp":160,"hp":110,"drat":3.9,"wt":2.875,"qsec":17.02,"vs":0,"am":1,"gear":4,"carb":4,"_row":"Mazda RX4 Wag"},{"mpg":22.8,"cyl":4,"disp":108,"hp":93,"drat":3.85,"wt":2.32,"qsec":18.61,"vs":1,"am":1,"gear":4,"carb":1,"_row":"Datsun 710"},{"mpg":21.4,"cyl":6,"disp":258,"hp":110,"drat":3.08,"wt":3.215,"qsec":19.44,"vs":1,"am":0,"gear":3,"carb":1,"_row":"Hornet 4 Drive"},{"mpg":18.7,"cyl":8,"disp":360,"hp":175,"drat":3.15,"wt":3.44,"qsec":17.02,"vs":0,"am":0,"gear":3,"carb":2,"_row":"Hornet Sportabout"},{"mpg":18.1,"cyl":6,"disp":225,"hp":105,"drat":2.76,"wt":3.46,"qsec":20.22,"vs":1,"am":0,"gear":3,"carb":1,"_row":"Valiant"},{"mpg":14.3,"cyl":8,"disp":360,"hp":245,"drat":3.21,"wt":3.57,"qsec":15.84,"vs":0,"am":0,"gear":3,"carb":4,"_row":"Duster 360"},{"mpg":24.4,"cyl":4,"disp":146.7,"hp":62,"drat":3.69,"wt":3.19,"qsec":20,"vs":1,"am":0,"gear":4,"c... <truncated>
Since the string consists of a single line, you gain nothing by printing only the first few lines.
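However, the toJSON() function from jsonlite (as well as the one from RJSONIO) lets you break the JSON string into lines by setting pretty = TRUE (I truncated the output manually because it is also long):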
jsonlite::toJSON(mtcars, pretty = TRUE)
## [
## {
## "mpg": 21,
## "cyl": 6,
## "disp": 160,
## "hp": 110,
## "drat": 3.9,
## "wt": 2.62,
## ...
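It is still a character vector of length 1, but the lines are now separated by newline characters (\n), which can be used to reach the goal: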
length(jsonlite::toJSON(mtcars, pretty = TRUE))
## [1] 1
as.character(jsonlite::toJSON(mtcars, pretty = TRUE))
## [1] "[\n {\n \"mpg\": 21,\n \"cyl\": 6,\n \"disp\": 160,\n \"hp\": 110,\n \"drat\": 3.9,\n \"wt\": 2.62,\n \"qsec\": 16.46,\n \"vs\": 0,\n \"am\": 1,\n \"gear\": 4,\n \"carb\": 4,\n \"_row\": \"Mazda RX4\"\n },\n {\n \"mpg\": 21,\n \"cyl\": 6,\n \"disp\": 160,\n \"hp\": 110,\n \"drat\": 3.9,\n \"wt\": 2.875,\n \"qsec\": 17.02,\n \"vs\": 0,\n \"am\": 1,\n \"gear\": 4,\n \"carb\": 4,\n \"_row\": \"Mazda RX4 Wag\"\n },\n {\n \"mpg\": 22.8,\n \"cyl\": 4,\n \"disp\": 108,\n \"hp\": 93,\n \"drat\": 3.85,\n \"wt\": 2.32,\n \"qsec\": 18.61,\n \"vs\": 1,\n \"am\": 1,\n \"gear\": 4,\n \"carb\": 1,\n \"_row\": \"Datsun 710\"\n },\n {\n \"mpg\": 21.4,\n \"cyl\": 6,\n \"disp\": 258,\n \"hp\": 110,\n \"drat\": 3.08,\n \"wt\": 3.215,\n \"qsec\": 19.44,\n \"vs\": 1,\n \"am\": 0,\n \"gear\": 3,\n \"carb\": 1,\n \"_row\": \"Hornet 4 Drive\"\n },\n {\n ... <truncated>
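I wrote a small function that takes a JSON object as input and prints a subset of its lines. It only works if the JSON object was created with pretty = TRUE. Here it is: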
print_json_lines <- function(json, lines) {
  # break the single string up into lines
  json_lines <- strsplit(json, "\n")[[1]]
  # keep only the requested lines
  json_lines <- json_lines[lines]
  # print them, one per output line
  cat(paste(json_lines, collapse = "\n"))
  # return the selected lines invisibly
  invisible(json_lines)
}
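It uses strsplit() to split the string into a character vector with one entry per line. The lines can then be selected by ordinary indexing with []. Because simply printing a character vector may put several strings on one output line, I merge the selected lines back into a single string (using paste()), separated by \n. This gives nicely formatted output: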
print_json_lines(jsonlite::toJSON(mtcars, pretty = TRUE), 1:5)
## [
## {
## "mpg": 21,
## "cyl": 6,
## "disp": 160,
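Five lines of pretty-printed JSON cover only part of the first record. If the aim is to check whole rows, a related sketch (not part of the function above) is to convert just the first rows of the data frame and parse the result back with jsonlite::fromJSON(). It assumes that checking the first five rows is enough to judge the conversion; the names json_head and round_trip are only illustrative.

```r
library(jsonlite)

# Convert only the first five rows; the result is small enough to read in full.
json_head <- toJSON(head(mtcars, 5), pretty = TRUE)
cat(json_head, "\n")

# Parse the JSON back into R and inspect what came out of the round trip.
round_trip <- fromJSON(json_head)
nrow(round_trip)
str(round_trip)
```

Comparing str(round_trip) with str(head(mtcars, 5)) shows whether column names, values and types survived the conversion.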
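As I mentioned at the beginning, print_json_lines() works with both jsonlite and RJSONIO. The reason is simply that both let you split the JSON string into lines via pretty = TRUE. However, the output looks different with RJSONIO, because it uses a different convention for the conversion (column by column rather than row by row):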
print_json_lines(RJSONIO::toJSON(mtcars, pretty = TRUE), 1:5)
## {
## "mpg" : [
## 21,
## 21,
## 22.8,
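The two layouts can also be compared within jsonlite alone, because its toJSON() has a dataframe argument that switches between row-wise and column-wise output. The following is only a sketch to illustrate the difference in convention; it does not claim that the column-wise jsonlite output is identical to what RJSONIO produces.

```r
library(jsonlite)

# Default layout: one JSON object per row, wrapped in an array.
rows_json <- toJSON(head(mtcars, 3), pretty = TRUE)

# Column-wise layout: one JSON array per column, wrapped in an object.
cols_json <- toJSON(head(mtcars, 3), dataframe = "columns", pretty = TRUE)

# The first character shows which container each layout uses.
rows_first <- substr(as.character(rows_json), 1, 1)  # "["
cols_first <- substr(as.character(cols_json), 1, 1)  # "{"
print(c(rows = rows_first, columns = cols_first))
```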
The function does not work with rjson, because I found no way to split its JSON output into lines.
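A possible workaround, which I have not tested against rjson itself, is to re-indent the single-line string with jsonlite::prettify() and then split it on newlines. In the sketch below the literal string compact is a hypothetical stand-in for the output of rjson::toJSON(), and the approach assumes that output is valid JSON.

```r
library(jsonlite)

# Hypothetical stand-in for the single-line string returned by rjson::toJSON().
compact <- '[{"mpg":21,"cyl":6},{"mpg":22.8,"cyl":4}]'

# prettify() re-indents an existing JSON string and inserts newlines.
pretty <- prettify(compact)

# Split on newlines and print the first five lines.
json_lines <- strsplit(as.character(pretty), "\n", fixed = TRUE)[[1]]
cat(head(json_lines, 5), sep = "\n")
```

The same prettified string could be passed to print_json_lines() instead of being split by hand.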