I have an object that I can access as JSON via a URL, and I want to put part of it into a data frame so I can analyze the data in R. This is how I currently do it:
# Read data in using fromJSON function
data <- RJSONIO::fromJSON('https://api-prod.footballindex.co.uk/football.allTradable24hrchanges?page=1&per_page=5000&sort=asc')
# The above is an example dataset to use for this question
# In my actual dataset some fields exist only in some elements of the list, so I add one to the example
# I want these handled by returning NA in the final data frame where the field does not exist
data[['items']][[1]]$newField <- 1
# It is only the data in the items field I am interested in
# Unlist each element to get all nested elements within the lists in flat format
dataList <- lapply(data[['items']], unlist)
# Combine all elements of the list together
dataDF <- dplyr::bind_rows(dataList)
# Convert into data.frame
dataDF <- data.frame(dataDF)
This works, but the bind_rows part takes a long time:
> system.time(dataDF <- dplyr::bind_rows(dataList))
   user  system elapsed
 42.195   0.000  42.216
It feels like there must be a faster way to do this.
I was told that data.table::rbindlist is a faster option, but using it gives me an error:
> dataDF <- data.table::rbindlist(dataList)
Error in data.table::rbindlist(dataList) :
Item 1 of input is not a data.frame, data.table or list
An answer suggested using do.call(rbind, ...), which runs quickly, but it does not correctly handle fields that exist in only some elements. For example:
dataDF2 <- data.frame(do.call(rbind, dataList))
> head(dataDF$country)
[1] "Côte d'Ivoire" "Italy" "England" "Scotland" "Germany" "France"
> head(dataDF2$country)
[1] "Côte d'Ivoire" "1.65" "1.62" "FALSE" "2.59" "France"
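The misalignment above is consistent with how rbind() treats plain vectors: it fills each row by position, ignores the element names, and recycles shorter vectors. A minimal sketch with two made-up records (the field names and values here are hypothetical, not taken from the API):

```r
# Two hypothetical unlisted records; the second one has no "score" field
a <- c(name = "A", score = "1.65", country = "Italy")
b <- c(name = "B", country = "England")

# rbind() takes the column names from the first full-length vector and fills
# the other rows by position, recycling values when a vector is too short.
# It also emits a warning about the length mismatch, silenced here for brevity.
m <- suppressWarnings(do.call(rbind, list(a, b)))

# Row 2 is shifted: "England" sits under "score", not under "country"
print(m)
```

So any record that lacks a field pushes its later values into the wrong columns.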
Thanks in advance for any help.
Answer 0 (score: 1)
data <- RJSONIO::fromJSON('https://api-prod.footballindex.co.uk/football.allTradable24hrchanges?page=1&per_page=5000&sort=asc')
system.time(dataDF <- as.data.frame(do.call(rbind, data[['items']])))
   user  system elapsed
  0.007   0.000   0.006
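If some items lack fields, as described in the question, one possible variation is to keep the unlist step but convert each flattened vector to a list, since data.table::rbindlist only accepts lists, data.frames, or data.tables. With fill = TRUE it matches columns by name and pads missing ones with NA. The sketch below uses made-up items (hypothetical field names and values) rather than the live API, and it has not been timed against the real dataset:

```r
library(data.table)

# Hypothetical stand-in for data[['items']]: nested lists whose fields vary
items <- list(
  list(name = "A", country = "Italy",
       price = list(buy = 1.65, sell = 1.62), newField = 1),
  list(name = "B", country = "England",
       price = list(buy = 2.59, sell = 2.50))
)

# unlist() flattens nested fields into one named vector per item;
# as.list() turns that vector into a list, which rbindlist() accepts
rows <- lapply(items, function(x) as.list(unlist(x)))

# fill = TRUE binds by column name and inserts NA where a field is missing
dataDF <- as.data.frame(rbindlist(rows, fill = TRUE))
print(dataDF)
```

Because unlist() coerces mixed values to character, every column comes back as character; something like type.convert(dataDF, as.is = TRUE) can restore numeric types afterwards.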