Converting JSON to a data.frame in R

Time: 2016-02-23 12:43:41

Tags: json r dataframe

I am having trouble converting a list to a data.frame.

First, I downloaded a dataset in JSON format from Data API:

library(httr)
library(jsonlite)

request1 <- POST(url = "https://api.data-api.io/v1/subjekti", add_headers('x-dataapi-key' = "xxxxxxx", 'content-type'= "application/json"), body = list(oib = oibreq), encode = "json")
json1 <- content(request1, type = "application/json")
json2 <- fromJSON(toJSON(json1, null = "null"), flatten = TRUE)

The problem is that the values in each column are list elements rather than plain vector entries. For example:

> json2[['oib']]
[[1]]
[1] "00045103869"

[[2]]
[1] "18527887472"

[[3]]
[1] "92680516748"

All column names:

> colnames(json2)
 [1] "oib"               "mb"                "mbs"               "mbo"               "rno"               "naziv"            
 [7] "adresa"            "grad"              "posta"             "zupanija"          "nkd2007"           "puo"              
[13] "godinaOsnivanja"   "status"            "temeljniKapital"   "isActive"          "datumBrisanja"     "predmetPoslovanja"

How can I convert this list to a data.frame?

Sorry, this is my first question on Stack Overflow. Here is my dataset:

> data <- dput(json3)
structure(list(oib = list("00045103869", "18527887472", "92680516748"), 
    mb = list("01699032", "03858731", "02591596"), mbs = list(
        "080451345", "060060881", "040260786"), mbo = c(NA, NA, 
    NA), rno = c(NA, NA, NA), naziv = list("INTERIJER DIZAJN d.o.o.", 
        "M - Đ COMMERCE d.o.o.", "HIP REKLAME d.o.o. u stečaju"), 
    adresa = list("Savska cesta 179", "Put Piketa 0", "Sadska 2"), 
    grad = list("Zagreb", "Sinj", "Rijeka"), posta = list("10000", 
        "21230", "51000"), zupanija = list("Grad Zagreb", "Splitsko-dalmatinska", 
        "Primorsko-goranska"), nkd2007 = list("1623", "4719", 
        "4711"), puo = list(92L, 92L, 92L), godinaOsnivanja = list(
        "2003", "1995", "2009"), status = list("bez postupka", 
        "bez postupka", "stečaj"), temeljniKapital = list("20.000,00 kn", 
        "509.100,00 kn", "20.000,00 kn"), isActive = list(TRUE, 
        TRUE, FALSE), datumBrisanja = list(NULL, NULL, "2015-12-24T00:00:00+01:00")), .Names = c("oib", 
"mb", "mbs", "mbo", "rno", "naziv", "adresa", "grad", "posta", 
"zupanija", "nkd2007", "puo", "godinaOsnivanja", "status", "temeljniKapital", 
"isActive", "datumBrisanja"), class = "data.frame", row.names = c(NA, 
3L))
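The dput output above suggests why a plain unlist() per column is not enough: some columns are lists, and a list column can contain NULL entries, which unlist() silently drops. The sketch below illustrates this on a hypothetical four-column subset of the data; the names `is_list_col` and `len_after_unlist` are made up for the illustration.

```r
# Hypothetical miniature of the dataset above (subset of columns only).
json2 <- structure(
  list(
    oib = list("00045103869", "18527887472", "92680516748"),
    puo = list(92L, 92L, 92L),
    mbo = c(NA, NA, NA),
    datumBrisanja = list(NULL, NULL, "2015-12-24T00:00:00+01:00")
  ),
  class = "data.frame",
  row.names = c(NA, -3L)
)

# TRUE for columns that are stored as lists instead of atomic vectors.
is_list_col <- vapply(json2, is.list, logical(1))

# Length of each column after unlist(): NULL entries vanish, so a column
# containing NULLs ends up shorter than the number of rows.
len_after_unlist <- vapply(json2, function(col) length(unlist(col)), integer(1))

print(is_list_col)
print(len_after_unlist)
```

Any column whose length after unlist() differs from nrow(json2) needs its NULL entries replaced before it can become an ordinary data.frame column.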

1 answer:

Answer 0: (score: 0)

A quick and dirty way is to replace the NULL values with NA values, for example like this:

f <- function(lst) lapply(lst, function(x) if (is.list(x)) f(x) else if (is.null(x)) NA_character_ else x)
df <- as.data.frame(lapply(f(json2), unlist))
str(df)
# 'data.frame': 3 obs. of 17 variables:
# $ oib            : Factor w/ 3 levels "00045103869",..: 1 2 3
# $ mb             : Factor w/ 3 levels "01699032","02591596",..: 1 3 2
# $ mbs            : Factor w/ 3 levels "040260786","060060881",..: 3 2 1
# $ mbo            : logi NA NA NA
# $ rno            : logi NA NA NA
# $ naziv          : Factor w/ 3 levels "HIP REKLAME d.o.o. u stecaju",..: 2 3 1
# $ adresa         : Factor w/ 3 levels "Put Piketa 0",..: 3 1 2
# $ grad           : Factor w/ 3 levels "Rijeka","Sinj",..: 3 2 1
# $ posta          : Factor w/ 3 levels "10000","21230",..: 1 2 3
# $ zupanija       : Factor w/ 3 levels "Grad Zagreb",..: 1 3 2
# $ nkd2007        : Factor w/ 3 levels "1623","4711",..: 1 3 2
# $ puo            : int 92 92 92
# $ godinaOsnivanja: Factor w/ 3 levels "1995","2003",..: 2 1 3
# $ status         : Factor w/ 2 levels "bez postupka",..: 1 1 2
# $ temeljniKapital: Factor w/ 2 levels "20.000,00 kn",..: 1 2 1
# $ isActive       : logi TRUE TRUE FALSE
# $ datumBrisanja  : Factor w/ 1 level "2015-12-24T00:00:00+01:00": NA NA 1

But there are probably better options.
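One possible variation on the same idea, offered as a sketch rather than a definitive solution: handle each column separately and pass stringsAsFactors = FALSE so text stays character instead of becoming factors. The helper name `flatten_col` and the reduced dataset are hypothetical, and the sketch assumes every list entry is either NULL or a single value.

```r
# Hypothetical subset of the question's data, rebuilt from its dput output.
json2 <- structure(
  list(
    oib = list("00045103869", "18527887472", "92680516748"),
    mbo = c(NA, NA, NA),
    puo = list(92L, 92L, 92L),
    isActive = list(TRUE, TRUE, FALSE),
    datumBrisanja = list(NULL, NULL, "2015-12-24T00:00:00+01:00")
  ),
  class = "data.frame",
  row.names = c(NA, -3L)
)

# Flatten one column: leave atomic columns alone; in list columns replace
# NULL entries with NA so that unlist() keeps one value per row.
# Assumption: each entry is NULL or a length-one value.
flatten_col <- function(col) {
  if (!is.list(col)) return(col)
  col[vapply(col, is.null, logical(1))] <- NA
  unlist(col)
}

# Rebuild the data.frame from the flattened columns, keeping strings as
# character vectors rather than factors.
df <- as.data.frame(lapply(json2, flatten_col), stringsAsFactors = FALSE)
str(df)
```

Keeping the columns as character makes later cleanup (for example parsing dates) simpler than working with factor levels, though which representation is preferable depends on what the data is used for.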