Question

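I am trying to parse the JSON below into a CSV file. I have a list of names, and I am trying to get their likely ethnicities from this website: http://www.textmap.com/ethnicity/. I have used statistical languages (SAS, Stata, ...), but I am not familiar with object-oriented languages. Any help would be greatly appreciated.

Here is what I have done so far: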

import groovy.json.*

def jsonSlurper = new JsonSlurper()
def result = jsonSlurper.parseText '''
{
    "George Washington": [
        { "scores":
            [ {"score": "0.07", "ethnicity": "Asian"},
              {"score": "0.00", "ethnicity": "GreaterAfrican"},
              {"score": "0.93", "ethnicity": "GreaterEuropean"}],
          "best":"GreaterEuropean" },
        { "scores":
            [ {"score": "1.00", "ethnicity": "British"},
              {"score": "0.00", "ethnicity": "Jewish"}, 
              {"score": "0.00", "ethnicity": "WestEuropean"}, 
              {"score": "0.00", "ethnicity": "EastEuropean"}],
          "best":"British" }
    ],
    "John Smith": [
        { "scores":
            [ {"score": "0.00", "ethnicity": "Asian"},
              {"score": "0.00", "ethnicity": "GreaterAfrican"},
              {"score": "1.00", "ethnicity": "GreaterEuropean"}],
          "best":"GreaterEuropean" },
        { "scores":
            [ {"score": "1.00", "ethnicity": "British"},
              {"score": "0.00", "ethnicity": "Jewish"},
              {"score": "0.00", "ethnicity": "WestEuropean"},
              {"score": "0.00", "ethnicity": "EastEuropean"}],
          "best": "British" }
    ],
    "Barack Obama": [
        { "scores":
            [ {"score": "0.00", "ethnicity": "Asian"},
              {"score": "1.00", "ethnicity": "GreaterAfrican"},
              {"score": "0.00", "ethnicity": "GreaterEuropean"}],
          "best":"GreaterAfrican"},
        {"scores":
            [ {"score": "1.00", "ethnicity": "Africans"},
              {"score": "0.00", "ethnicity": "Muslim"}],
          "best":"Africans"}
    ]
}    
'''

String[] header = new String[1]
header[0] = result["George Washington"].best[1]

result.each { entry, value ->
    println "Name: $entry Eth: $value.best"
}

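My problems are:

1. I don't know how to put the names (George Washington, etc.) into a separate string component, such as one element of the header. As a result, I cannot export the data to a CSV file.
2. I am not sure how to get a specific component of an element in an object (please excuse my poor description). For example, the "best" element in the object can take two values. I can get one of them with an index such as [1] in the string definition, but I don't know how to refer to it inside the loop.
3. I also have a general question about object-oriented languages. It seems that an object may contain many elements. How do I find out how many elements an object contains?

Thanks in advance!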

Answer 1

Object-oriented programming is a broad topic, but here is how to generate the CSV data. It leaves out best, because I did not know how to incorporate it.

def data = result.collect { name, map ->
    def output = [name: name]

    map.scores.flatten().each { output[it.ethnicity] = it.score }

    return output
}

def ethnicities = result.collect { name, map -> map.scores.ethnicity }.flatten().unique().toSorted()
def records = data.collect { person -> [person.name, ethnicities.collect { ethnicity -> person[ethnicity] ?: 0 }].flatten() }

def csv = records.inject(new StringBuilder("name,${ethnicities.join(',')}\n")) { builder, it -> 
    builder.append it.join(',') 
    builder.append "\n"

    return builder
}

data is a transformation of result that looks like this:

[
    ['name':'Barack Obama', 'Asian':'0.00', 'GreaterAfrican':'1.00', 'GreaterEuropean':'0.00', 'Africans':'1.00', 'Muslim':'0.00'], 
    ['name':'George Washington', 'Asian':'0.07', 'GreaterAfrican':'0.00', 'GreaterEuropean':'0.93', 'British':'1.00', 'Jewish':'0.00', 'WestEuropean':'0.00', 'EastEuropean':'0.00'], 
    ['name':'John Smith', 'Asian':'0.00', 'GreaterAfrican':'0.00', 'GreaterEuropean':'1.00', 'British':'1.00', 'Jewish':'0.00', 'WestEuropean':'0.00', 'EastEuropean':'0.00']
]

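ethnicities is a de-duplicated, sorted list of all the ethnicities in the JSON data. records is a list containing the rows to be written in CSV format. It fills in a zero for any ethnicity score that a person is missing. It looks like this: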

[
    ['Barack Obama', '1.00', '0.00', 0, 0, '1.00', '0.00', 0, '0.00', 0],     
    ['George Washington', 0, '0.07', '1.00', '0.00', '0.00', '0.93', '0.00', 0, '0.00'], 
    ['John Smith', 0, '0.00', '1.00', '0.00', '0.00', '1.00', '0.00', 0, '0.00']
]
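
The zero-filling step can be seen in isolation with a smaller input. The following is only a sketch: the two sample rows are hand-written stand-ins for data, and the ethnicity list is derived from the row keys rather than from result as in the code above.

```groovy
// Hypothetical, hand-written sample in the same shape as data above.
def data = [
    [name: 'Barack Obama', Asian: '0.00', Muslim: '0.00'],
    [name: 'John Smith',   Asian: '0.00', British: '1.00']
]

// Every key except name is an ethnicity; collect them once, de-duplicated and sorted.
def ethnicities = data.collectMany { it.keySet() - 'name' }.unique().sort()

// One row per person: the name, then one score per ethnicity.
// The Elvis operator (?:) substitutes 0 when the person has no such key.
def records = data.collect { person ->
    [person.name] + ethnicities.collect { ethnicity -> person[ethnicity] ?: 0 }
}

records.each { println it.join(',') }
```

Because every row is built from the same ethnicities list, all rows end up with the same number of columns in the same order, which is what the CSV header relies on.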

Finally, the output looks like this:

name,Africans,Asian,British,EastEuropean,GreaterAfrican,GreaterEuropean,Jewish,Muslim,WestEuropean
Barack Obama,1.00,0.00,0,0,1.00,0.00,0,0.00,0
George Washington,0,0.07,1.00,0.00,0.00,0.93,0.00,0,0.00
John Smith,0,0.00,1.00,0.00,0.00,1.00,0.00,0,0.00
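
The code above stops at an in-memory StringBuilder, while the question asks for a CSV file. A minimal sketch of the last step, assuming a temporary file as the destination (the file name and the short csv value here are placeholders, not part of the original answer):

```groovy
// Stand-in for the csv StringBuilder built by the answer's code.
def csv = new StringBuilder('name,Asian,British\n')
csv.append 'John Smith,0.00,1.00\n'

// Hypothetical output location; a temp file keeps the sketch free of side effects.
def outFile = File.createTempFile('ethnicity', '.csv')
outFile.deleteOnExit()

// Groovy's GDK adds setText to java.io.File; it overwrites the file with the given string.
outFile.setText(csv.toString(), 'UTF-8')

// Read the file back to confirm what was written.
def lines = outFile.readLines('UTF-8')
lines.each { println it }
```

To write to a fixed location instead, replace the temp file with something like new File('ethnicity.csv').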

Tips

result is a Map whose keys are the names, so you can get a name like this: def name = result.keySet()[0]
The result object supports Groovy's GPath.
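
These tips also cover the second and third questions. The sketch below uses a trimmed-down copy of the question's JSON and illustrative variable names (names, personCount, levelCount, bestByName) that do not appear in the original. Note that Groovy lists are zero-based, so best[1] in the question's code is the second value (British), not the first.

```groovy
import groovy.json.JsonSlurper

// A trimmed-down copy of the question's JSON, kept small for illustration.
def result = new JsonSlurper().parseText('''
{
    "George Washington": [
        { "scores": [ {"score": "0.93", "ethnicity": "GreaterEuropean"} ],
          "best": "GreaterEuropean" },
        { "scores": [ {"score": "1.00", "ethnicity": "British"} ],
          "best": "British" }
    ],
    "Barack Obama": [
        { "scores": [ {"score": "1.00", "ethnicity": "GreaterAfrican"} ],
          "best": "GreaterAfrican" },
        { "scores": [ {"score": "1.00", "ethnicity": "Africans"} ],
          "best": "Africans" }
    ]
}
''')

// The names are the map keys; sort them so the order is predictable.
def names = result.keySet().toList().sort()

// size() reports how many entries a Map or a List holds.
def personCount = result.size()                       // number of names
def levelCount  = result['George Washington'].size()  // classification levels for one name

// GPath: asking a List for .best collects that property from every element.
def bestByName = result.collectEntries { name, levels -> [(name): levels.best] }

// Inside a loop, index into the per-name list to pick one level.
result.each { name, levels ->
    println "Name: $name, first best: ${levels[0].best}, second best: ${levels[1].best}"
}
```

The same pattern extends to the scores: levels.scores returns one list of score entries per level, which is why the answer's code calls flatten() before iterating.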
