R data.frame to JSON with child nodes / hierarchical structure

Time: 2015-02-26 13:26:34

Tags: json r hierarchical rjsonio

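I am trying to write a data.frame from R to a JSON file, but in a hierarchical structure that contains child nodes. I found some examples and the RJSONIO package, but I could not apply them to my case.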

This is the data.frame in R:
> DF
   Date_by_Month    CCG Year Month refYear      name OC_5a OC_5b OC_5c 
1     2010-01-01 MyTown 2010    01    2009 2009/2010     0    15    27 
2     2010-02-01 MyTown 2010    02    2009 2009/2010     1    14    22 
3     2010-03-01 MyTown 2010    03    2009 2009/2010     1     6    10 
4     2010-04-01 MyTown 2010    04    2010 2010/2011     0    10    10 
5     2010-05-01 MyTown 2010    05    2010 2010/2011     1    16     7 
6     2010-06-01 MyTown 2010    06    2010 2010/2011     0    13    25 

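In addition to writing the data by month, I also want to create an aggregated child node, i.e. a "yearly" node that holds (for example) the sum over all months of each reporting year. This is what I would like the JSON file to look like: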

[
    {
     "ccg":"MyTown",
     "data":[
            {"period":"yearly",
             "scores":[
                {"name":"2009/2010","refYear":"2009","OC_5a":2, "OC_5b": 35, "OC_5c": 59},
                {"name":"2010/2011","refYear":"2010","OC_5a":1, "OC_5b": 39, "OC_5c": 42}
             ]
             },
            {"period":"monthly",
             "scores":[
                {"name":"2009/2010","refYear":"2009","month":"01","year":"2010","OC_5a":0, "OC_5b": 15, "OC_5c": 27},
                {"name":"2009/2010","refYear":"2009","month":"02","year":"2010","OC_5a":1, "OC_5b": 14, "OC_5c": 22},
                {"name":"2009/2010","refYear":"2009","month":"03","year":"2010","OC_5a":1, "OC_5b": 6, "OC_5c": 10},
                {"name":"2010/2011","refYear":"2010","month":"04","year":"2010","OC_5a":0, "OC_5b": 10, "OC_5c": 10},
                {"name":"2010/2011","refYear":"2010","month":"05","year":"2010","OC_5a":1, "OC_5b": 16, "OC_5c": 7},
                {"name":"2010/2011","refYear":"2010","month":"06","year":"2010","OC_5a":0, "OC_5b": 13, "OC_5c": 25}
                ]
             }
            ]
    }
]

Thank you very much for your help!

2 answers:

Answer 0: (score: 2)

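Expanding on my comment:

The jsonlite package has a lot of features, but what you describe does not really map onto a data frame, so I doubt any canned routine does this. Your best bet is probably to convert the data frame into a more general list (FYI, data frames are stored internally as lists of columns) whose structure exactly matches that of the JSON, and then simply run it through the converter.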
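To illustrate the remark that a data frame is internally a list of columns, here is a minimal base-R sketch. The `toy` data frame and its values are made up for illustration and are not the data from the question.

```r
# Toy data frame (hypothetical values, not the asker's data)
toy <- data.frame(name = c("a", "b"), score = c(1, 2), stringsAsFactors = FALSE)

is.list(toy)           # TRUE: a data frame is a list under the hood

# Dropping the data.frame class leaves a named list of column vectors
cols <- as.list(toy)
names(cols)            # "name" "score"
lengths(cols)          # one element per row in every column
```

Because the columns are the list elements, a naive conversion to JSON tends to come out column-wise, which is why the list has to be reshaped to mirror the target JSON first.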

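Doing this conversion is complicated in general, but in your case it should be fairly simple. The list would be structured exactly like the JSON data: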

list(
  list(
    ccg = "Town1",
    data = list(
      list(
        period = "yearly",
        scores = yearly_data_frame_town1
      ),
      list(
        period = "monthly",
        scores = monthly_data_frame_town1
      )
    )
  ),
  list(
    ccg = "Town2",
    data = list(
      list(
        period = "yearly",
        scores = yearly_data_frame_town2
      ),
      list(
        period = "monthly",
        scores = monthly_data_frame_town2
      )
    )
  )
)

Building this list should be a simple matter of looping over unique(DF$CCG) and using aggregate at each step to build the yearly data.
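The loop described above could be sketched roughly as follows. This is only one way to do it, assuming the sample data from the question; the names `build_ccg`, `score_cols` and `nested` are made up for illustration, and `Month` is read as character so that the leading zero survives. The final conversion step assumes jsonlite is installed and is skipped otherwise.

```r
# Sample data from the question; Month is kept as text to preserve "01", "02", ...
DF <- read.table(text = "Date_by_Month CCG Year Month refYear name OC_5a OC_5b OC_5c
2010-01-01 MyTown 2010 01 2009 2009/2010 0 15 27
2010-02-01 MyTown 2010 02 2009 2009/2010 1 14 22
2010-03-01 MyTown 2010 03 2009 2009/2010 1 6 10
2010-04-01 MyTown 2010 04 2010 2010/2011 0 10 10
2010-05-01 MyTown 2010 05 2010 2010/2011 1 16 7
2010-06-01 MyTown 2010 06 2010 2010/2011 0 13 25",
  header = TRUE, stringsAsFactors = FALSE,
  colClasses = c(Month = "character"))

score_cols <- c("OC_5a", "OC_5b", "OC_5c")

# Build the nested list for a single CCG (hypothetical helper)
build_ccg <- function(ccg) {
  monthly <- DF[DF$CCG == ccg, ]
  rownames(monthly) <- NULL
  # Sum every score column within each reporting year
  yearly <- aggregate(monthly[score_cols], monthly[c("name", "refYear")], sum)
  list(
    ccg = ccg,
    data = list(
      list(period = "yearly", scores = yearly),
      list(period = "monthly",
           scores = monthly[c("name", "refYear", "Month", "Year", score_cols)])
    )
  )
}

# One list element per CCG, mirroring the outer JSON array
nested <- lapply(unique(DF$CCG), build_ccg)

# jsonlite writes data frames row by row by default; auto_unbox turns
# length-one vectors such as ccg and period into plain scalars.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  cat(jsonlite::toJSON(nested, auto_unbox = TRUE, pretty = TRUE), "\n")
}
```

Keeping the scores as data frames inside the list leaves the row-wise serialization to the converter, so no manual reshaping of rows is needed in this variant.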

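If you need performance, look at the data.table or dplyr packages to do the looping and aggregating in one go. The former is flexible and fast, but a bit esoteric. The latter has a relatively simple syntax and similar performance, but it is designed specifically around pipelines that build data frames, so it may take some hacking to make it produce the right output format.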

Answer 1: (score: 2)

Looks like ssdecontrol has you covered... but here is my solution. You will need to loop over the unique CCGs and years to create the whole dataset...

df <- read.table(textConnection("Date_by_Month    CCG Year Month refYear      name OC_5a OC_5b OC_5c 
2010-01-01 MyTown 2010    01    2009 2009/2010     0    15    27 
2010-02-01 MyTown 2010    02    2009 2009/2010     1    14    22 
2010-03-01 MyTown 2010    03    2009 2009/2010     1     6    10 
2010-04-01 MyTown 2010    04    2010 2010/2011     0    10    10 
2010-05-01 MyTown 2010    05    2010 2010/2011     1    16     7 
2010-06-01 MyTown 2010    06    2010 2010/2011     0    13    25"), stringsAsFactors=FALSE, header=TRUE)


library(RJSONIO)

# Build the nested list for one CCG and one calendar year
to_list <- function(ccg, year){
  # Keep only the rows for this CCG and year
  df_monthly <- subset(df, CCG==ccg & Year==year)
  # Sum the score columns within each name/refYear group of that subset
  df_yearly <- aggregate(df_monthly[,c("OC_5a", "OC_5b", "OC_5c")], df_monthly[,c("name", "refYear")], sum)
  l <- list("ccg"=ccg, 
            data=list(list("period" = "yearly",
                      "scores" = as.list(df_yearly)
                      ),
                      list("period" = "monthly",
                           "scores" = as.list(df_monthly[,c("name", "refYear", "OC_5a", "OC_5b", "OC_5c")])
                      )
            )
       )
  return(l)
}
toJSON(to_list("MyTown", "2010"), pretty=TRUE)

This returns:

{
    "ccg" : "MyTown",
    "data" : [
        {
            "period" : "yearly",
            "scores" : {
                "name" : [
                    "2009/2010",
                    "2010/2011"
                ],
                "refYear" : [
                    2009,
                    2010
                ],
                "OC_5a" : [
                    2,
                    1
                ],
                "OC_5b" : [
                    35,
                    39
                ],
                "OC_5c" : [
                    59,
                    42
                ]
            }
        },
        {
            "period" : "monthly",
            "scores" : {
                "name" : [
                    "2009/2010",
                    "2009/2010",
                    "2009/2010",
                    "2010/2011",
                    "2010/2011",
                    "2010/2011"
                ],
                "refYear" : [
                    2009,
                    2009,
                    2009,
                    2010,
                    2010,
                    2010
                ],
                "OC_5a" : [
                    0,
                    1,
                    1,
                    0,
                    1,
                    0
                ],
                "OC_5b" : [
                    15,
                    14,
                    6,
                    10,
                    16,
                    13
                ],
                "OC_5c" : [
                    27,
                    22,
                    10,
                    10,
                    7,
                    25
                ]
            }
        }
    ]
}
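
Note that the output above stores each column as an array, whereas the question asked for one object per row. A possible way to get row-wise records, sketched here under assumptions rather than taken from the original answer, is to turn each data frame into a list of per-row lists before calling toJSON. The helper name `rows_to_records` is made up for illustration; the `yearly` values are the yearly sums shown above, and the RJSONIO call is skipped if the package is not installed.

```r
# Hypothetical helper: one named list per data frame row
rows_to_records <- function(d) {
  lapply(seq_len(nrow(d)), function(i) as.list(d[i, , drop = FALSE]))
}

# Yearly sums as they appear in the output above
yearly <- data.frame(
  name    = c("2009/2010", "2010/2011"),
  refYear = c(2009, 2010),
  OC_5a   = c(2, 1),
  OC_5b   = c(35, 39),
  OC_5c   = c(59, 42),
  stringsAsFactors = FALSE
)

records <- rows_to_records(yearly)
str(records[[1]])   # a named list with one value per column

# Each record becomes a JSON object, and the unnamed outer list an array
if (requireNamespace("RJSONIO", quietly = TRUE)) {
  cat(RJSONIO::toJSON(list(period = "yearly", scores = records), pretty = TRUE), "\n")
}
```

In to_list, this would amount to passing rows_to_records(df_yearly) instead of as.list(df_yearly), and likewise for the monthly scores.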