Pulling JSON data into an R data frame

Asked: 2018-09-03 03:23:54

Tags: r json jsonlite

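I am trying to pull JSON moon phase data from the USNO API. The problem is that the response contains two arrays of JSON data. I can't see a way to tell the observatory what to return, so I think I need to clean it up in R. Here is my code: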

library(sqldf);
library(jsonlite);

curr_date <- Sys.Date();
Q_date <- format.Date(curr_date, "%m/%d/%Y");
moon_call <- paste0("http://api.usno.navy.mil/moon/phase?date=",Q_date,"&nump=4");

moon_json <- fromJSON(moon_call, simplifyDataFrame =  TRUE);

moon_phases <- do.call("rbind.fill", lapply(moon_json$phasedata, as.data.frame));

The data I get back looks like this:

"","error","apiversion","year","month","day","numphases","datechanged","phasedata.phase","phasedata.date","phasedata.time"
"1",FALSE,"2.1.0",2018,8,29,4,FALSE,"Last Quarter","2018 Sep 03","02:37"
"2",FALSE,"2.1.0",2018,8,29,4,FALSE,"New Moon","2018 Sep 09","18:01"
"3",FALSE,"2.1.0",2018,8,29,4,FALSE,"First Quarter","2018 Sep 16","23:15"
"4",FALSE,"2.1.0",2018,8,29,4,FALSE,"Full Moon","2018 Sep 25","02:52"

When I convert it to a data frame, I get the following:

"","X[[i]]"
"1","Last Quarter"
"2","New Moon"
"3","First Quarter"
"4","Full Moon"
"5","2018 Sep 03"
"6","2018 Sep 09"
"7","2018 Sep 16"
"8","2018 Sep 25"
"9","02:37"
"10","18:01"
"11","23:15"
"12","02:52"

But what I want is a data frame containing only the phasedata.phase/.date/.time columns:

"","phase","date","time"
"1","Last Quarter","2018 Sep 03","02:37"
"2","New Moon","2018 Sep 09","18:01"
"3","First Quarter","2018 Sep 16","23:15"
"4","Full Moon","2018 Sep 25","02:52"

1 answer:

Answer 0: (score: 0)

  1. R lets you extract the three columns directly from the data frame moon_json, exactly as you wanted:

    moon_phases <- moon_json[, c('phasedata.phase', 'phasedata.date', 'phasedata.time')]

(There is no need for do.call("rbind.fill", lapply(..., as.data.frame)); that is just an inefficient and tortured way of slicing and concatenating.)

  2. Then you will want to rename the df columns to drop the phasedata. prefix:

    names(moon_phases) <- c('phase', 'date', 'time')

Or: names(moon_phases) <- gsub('^phasedata\\.', '', names(moon_phases))

  3. Also, you generally don't want explicit row names like 1,2,3... on the data frame moon_json, so just use row.names(moon_json) <- NULL, or data.frame(..., row.names=NULL), or as.data.frame(..., row.names=NULL).
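
Here is a minimal sketch of the three steps, using only base R. The data frame moon_df is a hypothetical stand-in built by hand from the values shown in the question (it is not fetched from the API), so treat it as an assumption about what the flattened data looks like:

```r
# Hypothetical stand-in for the flattened data shown in the question.
# Length-1 values are recycled by data.frame() to match the 4 phase rows.
moon_df <- data.frame(
  error = FALSE,
  apiversion = "2.1.0",
  year = 2018, month = 8, day = 29,
  numphases = 4,
  datechanged = FALSE,
  phasedata.phase = c("Last Quarter", "New Moon", "First Quarter", "Full Moon"),
  phasedata.date  = c("2018 Sep 03", "2018 Sep 09", "2018 Sep 16", "2018 Sep 25"),
  phasedata.time  = c("02:37", "18:01", "23:15", "02:52"),
  stringsAsFactors = FALSE
)

# Step 1: keep only the three phasedata columns.
moon_phases <- moon_df[, c("phasedata.phase", "phasedata.date", "phasedata.time")]

# Step 2: strip the "phasedata." prefix (note the doubled backslash in the R string).
names(moon_phases) <- sub("^phasedata\\.", "", names(moon_phases))

# Step 3: reset row names to the default automatic numbering.
row.names(moon_phases) <- NULL

# When exporting, row.names = FALSE leaves the row-name column out of the CSV.
write.csv(moon_phases, row.names = FALSE)
```

Note that row.names(x) <- NULL only resets the row names to automatic numbering; it is the row.names = FALSE argument to write.csv that keeps them out of the written file.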

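(jsonlite, or one of the other R JSON packages, should have an option to do this cleanup and renaming automatically. I don't know, as I don't use it much; check, and pick whichever package eases the pain of scraping.)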
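
On that last point, a possible shortcut, sketched under an assumption: the sample_json string below is a payload I reconstructed from the output in the question, not something taken from the USNO documentation. If the real response has this shape, fromJSON returns a list whose phasedata element is already a data frame with the three wanted columns:

```r
library(jsonlite)

# Assumed payload shape, reconstructed from the question's output (not from the API docs).
sample_json <- '{
  "error": false,
  "apiversion": "2.1.0",
  "year": 2018, "month": 8, "day": 29,
  "numphases": 4,
  "datechanged": false,
  "phasedata": [
    {"phase": "Last Quarter",  "date": "2018 Sep 03", "time": "02:37"},
    {"phase": "New Moon",      "date": "2018 Sep 09", "time": "18:01"},
    {"phase": "First Quarter", "date": "2018 Sep 16", "time": "23:15"},
    {"phase": "Full Moon",     "date": "2018 Sep 25", "time": "02:52"}
  ]
}'

# fromJSON simplifies an array of objects into a data frame by default.
moon_json <- fromJSON(sample_json)

# The nested array arrives as its own data frame: columns phase, date, time.
moon_phases <- moon_json$phasedata
print(moon_phases)
```

This would also explain the 12-row result in the question: lapply over a data frame iterates over its columns, so each of the three columns was turned into a separate one-column data frame and then stacked.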