Groovy JSON parser

Asked: 2015-12-07 02:09:11

Tags: json groovy

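I am trying to parse the following JSON into a CSV file. I have a list of names, and I am trying to get their likely ethnicity from this website: http://www.textmap.com/ethnicity/. I have used statistical languages (SAS, Stata, ...), but I am not familiar with object-oriented languages. Any help would be greatly appreciated.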

This is what I have done so far:

import groovy.json.*

def jsonSlurper = new JsonSlurper()
def result = jsonSlurper.parseText '''
{
    "George Washington": [
        { "scores":
            [ {"score": "0.07", "ethnicity": "Asian"},
              {"score": "0.00", "ethnicity": "GreaterAfrican"},
              {"score": "0.93", "ethnicity": "GreaterEuropean"}],
          "best":"GreaterEuropean" },
        { "scores":
            [ {"score": "1.00", "ethnicity": "British"},
              {"score": "0.00", "ethnicity": "Jewish"}, 
              {"score": "0.00", "ethnicity": "WestEuropean"}, 
              {"score": "0.00", "ethnicity": "EastEuropean"}],
          "best":"British" }
    ],
    "John Smith": [
        { "scores":
            [ {"score": "0.00", "ethnicity": "Asian"},
              {"score": "0.00", "ethnicity": "GreaterAfrican"},
              {"score": "1.00", "ethnicity": "GreaterEuropean"}],
          "best":"GreaterEuropean" },
        { "scores":
            [ {"score": "1.00", "ethnicity": "British"},
              {"score": "0.00", "ethnicity": "Jewish"},
              {"score": "0.00", "ethnicity": "WestEuropean"},
              {"score": "0.00", "ethnicity": "EastEuropean"}],
          "best": "British" }
    ],
    "Barack Obama": [
        { "scores":
            [ {"score": "0.00", "ethnicity": "Asian"},
              {"score": "1.00", "ethnicity": "GreaterAfrican"},
              {"score": "0.00", "ethnicity": "GreaterEuropean"}],
          "best":"GreaterAfrican"},
        {"scores":
            [ {"score": "1.00", "ethnicity": "Africans"},
              {"score": "0.00", "ethnicity": "Muslim"}],
          "best":"Africans"}
    ]
}    
'''

String[] header = new String[1];
      header[0] = result["George Washington"].best[1];


result.each {entry,value ->
println "Name: $entry Eth: $value.best" 
}

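My questions are:

  1. I don't know how to put the names (George Washington, etc.) into a separate string component, such as one element of header. As a result, I can't export the data to a CSV file.

  2. I'm not sure how to get a specific component of an element in the object (please excuse my poor description). For example, the "best" element in the object can take two values. I was able to get one of them in the string definition by using an index, say [1], but I don't know how to refer to it inside the loop.

  3. I also have a general question about object-oriented languages. It seems an object can contain many elements. How do I count how many elements an object contains?

Thanks in advance!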

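1 answer:

Answer 0 (score: 1)

Object-oriented programming is a broad topic, but here is how to generate the CSV data. It leaves out best, because I don't know how you would want to incorporate it.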

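// One map per person: the name plus every ethnicity score found under that name.
def data = result.collect { name, map ->
    def output = [name: name]

    // map.scores is a list of score lists (one per block); flatten it into a single list.
    map.scores.flatten().each { output[it.ethnicity] = it.score }

    return output
}

// Every distinct ethnicity in the JSON, sorted; these become the CSV columns.
def ethnicities = result.collect { name, map -> map.scores.ethnicity }.flatten().unique().toSorted()
// One row per person; a missing score is written as 0.
def records = data.collect { person -> [person.name, ethnicities.collect { ethnicity -> person[ethnicity] ?: 0 }].flatten() }

// Header line first, then one comma-joined line per record.
def csv = records.inject(new StringBuilder("name,${ethnicities.join(',')}\n")) { builder, it ->
    builder.append it.join(',')
    builder.append "\n"

    return builder
}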

data is a transformation of result, and looks like this:

[
    ['name':'Barack Obama', 'Asian':'0.00', 'GreaterAfrican':'1.00', 'GreaterEuropean':'0.00', 'Africans':'1.00', 'Muslim':'0.00'], 
    ['name':'George Washington', 'Asian':'0.07', 'GreaterAfrican':'0.00', 'GreaterEuropean':'0.93', 'British':'1.00', 'Jewish':'0.00', 'WestEuropean':'0.00', 'EastEuropean':'0.00'], 
    ['name':'John Smith', 'Asian':'0.00', 'GreaterAfrican':'0.00', 'GreaterEuropean':'1.00', 'British':'1.00', 'Jewish':'0.00', 'WestEuropean':'0.00', 'EastEuropean':'0.00']
] 

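ethnicities is a unique list of all the ethnicities in the JSON data. records is a list containing the data to be written in CSV format, with a zero added for any ethnicity score that is missing. It looks like this: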

[
    ['Barack Obama', '1.00', '0.00', 0, 0, '1.00', '0.00', 0, '0.00', 0],     
    ['George Washington', 0, '0.07', '1.00', '0.00', '0.00', '0.93', '0.00', 0, '0.00'], 
    ['John Smith', 0, '0.00', '1.00', '0.00', '0.00', '1.00', '0.00', 0, '0.00']
]
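The zero-filling step can be seen in isolation in the following minimal sketch. The person map and the three-item ethnicities list are shortened stand-ins made up for illustration, not the full data above.

```groovy
// One person's scores, shaped like an entry of the data list (shortened for illustration).
def person = [name: 'Barack Obama', Asian: '0.00', Muslim: '0.00']
def ethnicities = ['Asian', 'British', 'Muslim']

// person['British'] is null because the key is absent, so the Elvis operator ?: supplies 0.
def row = [person.name] + ethnicities.collect { person[it] ?: 0 }
println row
```

Note that ?: follows Groovy truth, so an empty-string score would also be replaced by 0, while a non-empty string such as '0.00' is kept as-is.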

Finally, the output looks like this:

name,Africans,Asian,British,EastEuropean,GreaterAfrican,GreaterEuropean,Jewish,Muslim,WestEuropean
Barack Obama,1.00,0.00,0,0,1.00,0.00,0,0.00,0
George Washington,0,0.07,1.00,0.00,0.00,0.93,0.00,0,0.00
John Smith,0,0.00,1.00,0.00,0.00,1.00,0.00,0,0.00
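Since the goal in the question is a CSV file, here is a rough sketch of writing such rows to disk. The two-column ethnicities list, the two records, and the temporary file are assumptions chosen to keep the example self-contained; in practice you would probably use a real path such as new File('ethnicity.csv').

```groovy
// Shortened, made-up stand-ins for the ethnicities and records lists.
def ethnicities = ['Asian', 'British']
def records = [['George Washington', '0.07', '1.00'],
               ['Barack Obama', '0.00', 0]]

// Header line first, then one comma-joined line per record.
def lines = [(['name'] + ethnicities).join(',')] + records.collect { it.join(',') }

// Assumption: a temp file is used here so the sketch does not overwrite anything.
def outFile = File.createTempFile('ethnicity', '.csv')
outFile.text = lines.join('\n') + '\n'

println outFile.text
```

This plain join is only safe while no value contains a comma, quote, or newline; names like "Smith, John" would need proper CSV quoting.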

Tips

  1. result is a Map, and the names are its keys. So you can get a name like this: def name = result.keySet()[0]
  2. The result object supports Groovy's GPath.
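The two tips, together with the question about counting elements, might be sketched as follows. The JSON is a trimmed copy of the question's data (one name, two scores per block), and the variable names names, blocks and bestValues are made up for this example.

```groovy
import groovy.json.JsonSlurper

// A trimmed-down copy of the question's JSON, to keep the sketch short.
def result = new JsonSlurper().parseText('''
{
    "George Washington": [
        { "scores": [ {"score": "0.07", "ethnicity": "Asian"},
                      {"score": "0.93", "ethnicity": "GreaterEuropean"} ],
          "best": "GreaterEuropean" },
        { "scores": [ {"score": "1.00", "ethnicity": "British"},
                      {"score": "0.00", "ethnicity": "Jewish"} ],
          "best": "British" }
    ]
}
''')

// The names are the map's keys; turning the key set into a list gives something usable as a header.
def names = result.keySet() as List
println names

// size() counts elements: here the number of names in the map.
println result.size()

result.each { name, blocks ->
    // blocks.best is a GPath expression: it collects "best" from every block in the list.
    def bestValues = blocks.best
    println "Name: $name, blocks: ${blocks.size()}, first best: ${bestValues[0]}, second best: ${bestValues[1]}"
}
```

Indexes start at 0, so bestValues[0] is the "best" of the first block and bestValues[1] that of the second; the same indexing works inside the each loop from the question.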