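How do I parse the output of this API call (in R) into a .txt table format? (Related to Israel's "open government" :))

Time: 2011-03-18 12:21:24

Tags: parsing r

Israel has published its entire budget, and there is an API for extracting the data. However, I don't know how to parse the result into txt/csv format.

Here is an example link to make a call for data.

This is the output: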

[
    {
        "parent": [
            {
                "budget_id": "00", 
                "title": "המדינה"
            }
        ], 
        "net_amount_revised": 6075053, 
        "year": 2003, 
        "title": "השכלה גבוהה", 
        "gross_amount_used": 5942975, 
        "gross_amount_revised": 5942975, 
        "budget_id": "0021", 
        "net_amount_used": 5936491, 
        "inflation_factor": 1.15866084989269, 
        "net_amount_allocated": 5861591, 
        "gross_amount_allocated": 5861591
    }, 
    {
        "parent": [
            {
                "budget_id": "0021", 
                "title": "השכלה גבוהה"
            }, 
            {
                "budget_id": "00", 
                "title": "המדינה"
            }
        ], 
        "net_amount_revised": 5364976, 
        "year": 2003, 
        "title": "השתתפות בתקציב המוסדות להשכלה גבוהה", 
        "gross_amount_used": 5337585, 
        "gross_amount_revised": 5337584, 
        "budget_id": "002102", 
        "net_amount_used": 5331101, 
        "inflation_factor": 1.15866084989269, 
        "net_amount_allocated": 4985915, 
        "gross_amount_allocated": 4985915
    }, 
    {
        "parent": [
            {
                "budget_id": "0021", 
                "title": "השכלה גבוהה"
            }, 
            {
                "budget_id": "00", 
                "title": "המדינה"
            }
        ], 
        "net_amount_revised": 565495, 
        "year": 2003, 
        "title": "השתתפות בפעולות", 
        "gross_amount_used": 462490, 
        "gross_amount_revised": 462490, 
        "budget_id": "002103", 
        "net_amount_used": 462490, 
        "inflation_factor": 1.15866084989269, 
        "net_amount_allocated": 559293, 
        "gross_amount_allocated": 559293
    }, 
    {
        "parent": [
            {
                "budget_id": "0021", 
                "title": "השכלה גבוהה"
            }, 
            {
                "budget_id": "00", 
                "title": "המדינה"
            }
        ], 
        "net_amount_revised": 0, 
        "year": 2003, 
        "title": "רזרבה להתייקרויות", 
        "gross_amount_used": 0, 
        "gross_amount_revised": null, 
        "budget_id": "002105", 
        "net_amount_used": null, 
        "inflation_factor": 1.15866084989269, 
        "net_amount_allocated": 171801, 
        "gross_amount_allocated": 171801
    }, 
    {
        "parent": [
            {
                "budget_id": "0021", 
                "title": "השכלה גבוהה"
            }, 
            {
                "budget_id": "00", 
                "title": "המדינה"
            }
        ], 
        "net_amount_revised": 108000, 
        "year": 2003, 
        "title": "פיתוח מוסדות להשכלה    גבוהה", 
        "gross_amount_used": 108000, 
        "gross_amount_revised": 108000, 
        "budget_id": "002106", 
        "net_amount_used": 108000, 
        "inflation_factor": 1.15866084989269, 
        "net_amount_allocated": 108000, 
        "gross_amount_allocated": 108000
    }, 
    {
        "parent": [
            {
                "budget_id": "0021", 
                "title": "השכלה גבוהה"
            }, 
            {
                "budget_id": "00", 
                "title": "המדינה"
            }
        ], 
        "net_amount_revised": 23634, 
        "year": 2003, 
        "title": "תחום פעולה כללי", 
        "gross_amount_used": 23634, 
        "gross_amount_revised": 23634, 
        "budget_id": "002101", 
        "net_amount_used": 23634, 
        "inflation_factor": 1.15866084989269, 
        "net_amount_allocated": 23634, 
        "gross_amount_allocated": 23634
    }, 
    {
        "parent": [
            {
                "budget_id": "0021", 
                "title": "השכלה גבוהה"
            }, 
            {
                "budget_id": "00", 
                "title": "המדינה"
            }
        ], 
        "net_amount_revised": 12948, 
        "year": 2003, 
        "title": "פעולות עם משרדים       ומוסדות אחרים", 
        "gross_amount_used": 11266, 
        "gross_amount_revised": 11266, 
        "budget_id": "002104", 
        "net_amount_used": 11266, 
        "inflation_factor": 1.15866084989269, 
        "net_amount_allocated": 12948, 
        "gross_amount_allocated": 12948
    }
]

What is the way to parse this into a table format?

Thanks!

Tal

4 Answers:

Answer 0: (score: 3)

If you install the rjson package, you should be able to do:

library(rjson)
do.call('rbind', fromJSON(file = "http://budget.yeda.us/0021?year=2003&depth=1"))

[Edit]

Actually, the variable-length inner parent list causes a problem here, but this should get you halfway there.
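One possible way around the variable-length parent list is to collapse it into a single string per record before binding the rows. The following is only a sketch under assumptions: the inline JSON is a shortened stand-in for the API output shown in the question (titles replaced with ASCII placeholders), and `flatten_record` is a hypothetical helper name, not part of rjson.

```r
library(rjson)

# Shortened stand-in for the API output (assumed sample, not real API data).
json_txt <- '[
  {"parent": [{"budget_id": "00", "title": "A"}],
   "year": 2003, "budget_id": "0021", "net_amount_used": 5936491},
  {"parent": [{"budget_id": "0021", "title": "B"},
              {"budget_id": "00", "title": "A"}],
   "year": 2003, "budget_id": "002105", "net_amount_used": null}
]'
records <- fromJSON(json_txt)

# Hypothetical helper: give every record the same flat shape.
flatten_record <- function(rec) {
  # Join the parent budget ids into one string, e.g. "0021;00".
  parent_ids <- vapply(rec$parent, function(p) p$budget_id, character(1))
  rec$parent <- paste(parent_ids, collapse = ";")
  # JSON null arrives as NULL in R; store NA so the column is kept.
  rec[vapply(rec, is.null, logical(1))] <- list(NA)
  as.data.frame(rec, stringsAsFactors = FALSE)
}

# One single-row data frame per record, stacked into one table.
budget <- do.call(rbind, lapply(records, flatten_record))
```

Collapsing the parents into one column keeps the table rectangular at the cost of having to split that column again if the individual parent ids are needed later.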

Answer 1: (score: 2)

Yes, this is JSON. fromJSON will convert it into a list for you:

library(RCurl)  # provides getURL
library(rjson)
resp <- getURL("http://budget.yeda.us/0021?year=2003&depth=1")
resp <- fromJSON(resp)

This gives you the data as a list. For a data frame, try:

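library(plyr)
# Replace NULL values (JSON null) with the string "NULL" so data.frame() accepts them
resp <- llply(resp, function(x) llply(x, function(y) ifelse(is.null(y), "NULL", y)))
budget <- data.frame()
# Stack one single-row data frame per record; rbind.fill pads missing columns with NA
for (i in seq_along(resp)) {
  budget <- rbind.fill(budget, data.frame(resp[[i]]))
}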

The nested llply takes care of some of the unpleasantness of building a data frame from data that contains null values.
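Since the question asks for a txt/csv file, the remaining step is writing the data frame out. A minimal sketch using base R's write.table follows; the small `budget` data frame and the `out_file` path are stand-ins assumed for illustration, not the real result of the API call.

```r
# Assumed stand-in for the `budget` data frame built from the API response.
budget <- data.frame(
  budget_id = c("0021", "002105"),
  year = c(2003, 2003),
  net_amount_used = c(5936491, NA),
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".txt")  # assumed output path

# Tab-separated text, no row names, missing values written as empty fields.
write.table(budget, file = out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, na = "", fileEncoding = "UTF-8")

# Read it back, keeping budget_id as text so leading zeros survive.
check <- read.delim(out_file, colClasses = c(budget_id = "character"))
```

A tab separator is a cautious choice here because the Hebrew titles contain spaces. Note that with the llply approach above, nulls are stored as the string "NULL" rather than NA, so the `na` argument would not affect them.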

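Answer 2: (score: 1)

Looks like JSON. Try the rjson package, though it may need some looping or tricky list fiddling.

It's lunchtime now, otherwise I'd have pasted a solution. Give the non-lunching part of the hive mind a few minutes...

Answer 3: (score: 1)

This is similar to the previous answers, but it also creates columns for the second budget_id and title fields in parent, not just the first. It is also structured slightly differently: parent and the rest of the fields are processed separately and then put back together.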

rest
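One way the approach described in this answer might look is sketched below. This is an illustration written under assumptions, not the answerer's own code: the inline JSON is a shortened stand-in for the API output, and the names `parent_part`, `rest_part`, and the `parent_budget_id1`-style column names are hypothetical.

```r
library(rjson)

# Shortened stand-in for the API output (assumed sample, not real API data).
json_txt <- '[
  {"parent": [{"budget_id": "00", "title": "A"}],
   "year": 2003, "budget_id": "0021", "net_amount_used": 5936491},
  {"parent": [{"budget_id": "0021", "title": "B"},
              {"budget_id": "00", "title": "A"}],
   "year": 2003, "budget_id": "002105", "net_amount_used": null}
]'
records <- fromJSON(json_txt)

# Longest parent chain, so every row gets the same number of parent columns.
max_parents <- max(vapply(records, function(r) length(r$parent), integer(1)))

# Part 1: parent fields as budget_id, title pairs, padded with NA.
parent_part <- do.call(rbind, lapply(records, function(r) {
  vals <- as.character(unlist(
    lapply(r$parent, function(p) c(p$budget_id, p$title))
  ))
  vals[seq_len(2 * max_parents)]  # indexing past the end yields NA
}))
colnames(parent_part) <- paste0(
  rep(c("parent_budget_id", "parent_title"), times = max_parents),
  rep(seq_len(max_parents), each = 2)
)

# Part 2: everything except parent, with JSON null stored as NA.
rest_part <- do.call(rbind, lapply(records, function(r) {
  r$parent <- NULL
  r[vapply(r, is.null, logical(1))] <- list(NA)
  as.data.frame(r, stringsAsFactors = FALSE)
}))

# Put the two parts back together, one row per budget item.
budget <- cbind(rest_part,
                as.data.frame(parent_part, stringsAsFactors = FALSE))
```

Records with fewer parents than the longest chain get NA in the unused parent columns, which keeps the result rectangular and ready for write.table or write.csv.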