Getting a data.frame from JSON (R)

Time: 2016-03-14 12:35:46

Tags: r

My JSON file contains 400 elements. I want to convert it to a data.frame, which I need for analysis.

For example, suppose the JSON file contains:

"user_id":"severalHint","device_id": "dfg4644fsnthj"
"user_id":"berrrrka","session_time": "2314"

I want to turn it into a data table like this:

   user_id      device_id     session_time
1  severalHint  dfg4644fsnthj NA
2  berrrrka     NA            2314

I have already converted my JSON to a list. Here is the input:

list(structure(list(data = structure(list(ios_idfv = "57BE266B-CA71-4C53-9F28-4CE0F6FFDB55", 
    engine_version = "unity 4.6.9", category = "resource", os_version = "ios 9.2", 
    ios_idfa = "256A9626-DD07-40C2-BB3F-862F26D89DEF", amount = 100, 
    v = 2, sdk_version = "unity 2.4.3", user_id = "256A9626-DD07-40C2-BB3F-862F26D89DEF", 
    session_num = 2, platform = "ios", connection_type = "wifi", 
    manufacturer = "apple", client_ts = 1457359485, session_id = "a4671aaa-bd13-4655-a42f-c333a9709c58", 
    device = "iPad4,5", event_id = "Source:Credits:Reward:DailyReward", 
    build = "1.0"), .Names = c("ios_idfv", "engine_version", 
"category", "os_version", "ios_idfa", "amount", "v", "sdk_version", 
"user_id", "session_num", "platform", "connection_type", "manufacturer", 
"client_ts", "session_id", "device", "event_id", "build")), first_in_batch = TRUE, 
    country_code = "RU", arrival_ts = 1457359488, game_id = 24540, 
    ip = "95.107.103.0", user_meta = structure(list(install_ad = "80961452183", 
        install_ad = "80961452183", install_ad = "80961452183", 
        install_ad = "80961452183", install_campaign = "wizzo", 
        install_campaign = "wizzo", install_campaign = "wizzo", 
        install_campaign = "wizzo", install_publisher = "d", 
        install_publisher = "d", install_publisher = "d", install_publisher = "d", 
        install_site = "Heroes and Castles 2 Free", install_site = "Heroes and Castles 2 Free", 
        install_site = "Heroes and Castles 2 Free", install_site = "Heroes and Castles 2 Free", 
        install_ts = 1457287997, revenue = list(), cohort_week = 1456704000, 
        cohort_month = 1456790400), .Names = c("install_ad", 
    "install_ad", "install_ad", "install_ad", "install_campaign", 
    "install_campaign", "install_campaign", "install_campaign", 
    "install_publisher", "install_publisher", "install_publisher", 
    "install_publisher", "install_site", "install_site", "install_site", 
    "install_site", "install_ts", "revenue", "cohort_week", "cohort_month"
    ))), .Names = c("data", "first_in_batch", "country_code", 
"arrival_ts", "game_id", "ip", "user_meta")))

1 answer:

Answer 0 (score: 2)

OK, let's start with your JSON example:

js = '[
          {"user_id":"severalHint","device_id": "dfg4644fsnthj"},
          {"user_id":"berrrrka","session_time": "2314"}
      ]'

This can be converted to a list with the rjson package:

require(rjson)
js.list = fromJSON(js)

There is also a good thread on converting a list to a data frame: Convert R list to dataframe with missing/NULL elements

In our case, the code is

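library(plyr)

# Turn each record into a one-row data frame, dropping NULL fields first,
# then stack the rows; rbind.fill fills columns a record lacks with NA.
rbind.fill(lapply(js.list, function(f) {
    as.data.frame(Filter(Negate(is.null), f))
}))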

which produces the output

      user_id     device_id session_time
1 severalHint dfg4644fsnthj         <NA>
2    berrrrka          <NA>         2314

and that should solve your problem.
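
If plyr is not available, the same idea can be sketched in base R. This is only a minimal sketch, not part of the original solution: `records_to_df` is a hypothetical helper name, the list is built by hand as a stand-in for what `fromJSON(js)` returns, and it assumes every record is flat with one scalar value per field (so it would not handle the nested `data` / `user_meta` structure from the question as-is).

```r
# Hand-built stand-in for the list returned by rjson::fromJSON(js)
js.list <- list(
  list(user_id = "severalHint", device_id = "dfg4644fsnthj"),
  list(user_id = "berrrrka", session_time = "2314")
)

# Hypothetical helper: builds a data.frame from a list of flat named lists.
# Assumes each field holds a single scalar value.
records_to_df <- function(records) {
  # Union of all field names, in order of first appearance
  cols <- unique(unlist(lapply(records, names)))
  columns <- lapply(cols, function(col) {
    # One value per record; absent or NULL fields become NA
    unlist(lapply(records, function(rec) {
      v <- rec[[col]]
      if (is.null(v)) NA else v
    }))
  })
  names(columns) <- cols
  as.data.frame(columns, stringsAsFactors = FALSE)
}

df <- records_to_df(js.list)
print(df)
```

Building the result column by column means each column's type is decided once by `unlist`, rather than row by row when the pieces are bound together.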