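I am trying to parse the following JSON into a CSV file. I have a list of names, and I am trying to get their likely ethnicities from this site: http://www.textmap.com/ethnicity/. I have used statistical languages (SAS, Stata, ...), but I am not familiar with object-oriented languages. Any help would be much appreciated.
Here is what I have so far: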
import groovy.json.*
def jsonSlurper = new JsonSlurper()
def result = jsonSlurper.parseText '''
{
"George Washington": [
{ "scores":
[ {"score": "0.07", "ethnicity": "Asian"},
{"score": "0.00", "ethnicity": "GreaterAfrican"},
{"score": "0.93", "ethnicity": "GreaterEuropean"}],
"best":"GreaterEuropean" },
{ "scores":
[ {"score": "1.00", "ethnicity": "British"},
{"score": "0.00", "ethnicity": "Jewish"},
{"score": "0.00", "ethnicity": "WestEuropean"},
{"score": "0.00", "ethnicity": "EastEuropean"}],
"best":"British" }
],
"John Smith": [
{ "scores":
[ {"score": "0.00", "ethnicity": "Asian"},
{"score": "0.00", "ethnicity": "GreaterAfrican"},
{"score": "1.00", "ethnicity": "GreaterEuropean"}],
"best":"GreaterEuropean" },
{ "scores":
[ {"score": "1.00", "ethnicity": "British"},
{"score": "0.00", "ethnicity": "Jewish"},
{"score": "0.00", "ethnicity": "WestEuropean"},
{"score": "0.00", "ethnicity": "EastEuropean"}],
"best": "British" }
],
"Barack Obama": [
{ "scores":
[ {"score": "0.00", "ethnicity": "Asian"},
{"score": "1.00", "ethnicity": "GreaterAfrican"},
{"score": "0.00", "ethnicity": "GreaterEuropean"}],
"best":"GreaterAfrican"},
{"scores":
[ {"score": "1.00", "ethnicity": "Africans"},
{"score": "0.00", "ethnicity": "Muslim"}],
"best":"Africans"}
]
}
'''
String[] header = new String[1];
header[0] = result["George Washington"].best[1];
result.each {entry,value ->
println "Name: $entry Eth: $value.best"
}
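My questions are:
1. I don't know how to put the names (George Washington, etc.) into a separate string component, for example as one element of header. Because of that, I can't export the data to a CSV file.
2. I'm not sure how to get a specific component of an element inside the object (please excuse my poor description). For example, best can take two values for each name. I can pick one of them in the String definition with, say, [1], but I don't know how to refer to it inside the loop.
3. I also have a general question about object-oriented languages. An object can apparently contain many elements. How do I count how many elements an object has?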
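Thanks in advance!
Answer 0 (score: 1)
Object-oriented programming is a broad topic, but here is how to generate the CSV data. It leaves out best, because I don't know how you would want to incorporate it.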
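// One map per person: the name plus one entry per ethnicity score.
// "map" is the list of score blocks stored under each name.
def data = result.collect { name, map ->
def output = [name: name]
map.scores.flatten().each { output[it.ethnicity] = it.score }
return output
}
// Sorted, de-duplicated list of every ethnicity that appears in the JSON.
def ethnicities = result.collect { name, map -> map.scores.ethnicity }.flatten().unique().toSorted()
// One row per person, in column order; a missing score becomes 0.
def records = data.collect { person -> [person.name, ethnicities.collect { ethnicity -> person[ethnicity] ?: 0 }].flatten() }
// Start with the header line, then append one comma-separated line per record.
def csv = records.inject(new StringBuilder("name,${ethnicities.join(',')}\n")) { builder, it ->
builder.append it.join(',')
builder.append "\n"
return builder
}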
data is a transformation of result, and looks like this:
[
['name':'Barack Obama', 'Asian':'0.00', 'GreaterAfrican':'1.00', 'GreaterEuropean':'0.00', 'Africans':'1.00', 'Muslim':'0.00'],
['name':'George Washington', 'Asian':'0.07', 'GreaterAfrican':'0.00', 'GreaterEuropean':'0.93', 'British':'1.00', 'Jewish':'0.00', 'WestEuropean':'0.00', 'EastEuropean':'0.00'],
['name':'John Smith', 'Asian':'0.00', 'GreaterAfrican':'0.00', 'GreaterEuropean':'1.00', 'British':'1.00', 'Jewish':'0.00', 'WestEuropean':'0.00', 'EastEuropean':'0.00']
]
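ethnicities is the unique list of all ethnicities in the JSON data. records is a list containing the data to be written in CSV format. It fills in a zero for any missing ethnicity score. It looks like this: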
[
['Barack Obama', '1.00', '0.00', 0, 0, '1.00', '0.00', 0, '0.00', 0],
['George Washington', 0, '0.07', '1.00', '0.00', '0.00', '0.93', '0.00', 0, '0.00'],
['John Smith', 0, '0.00', '1.00', '0.00', '0.00', '1.00', '0.00', 0, '0.00']
]
Finally, the output looks like this:
name,Africans,Asian,British,EastEuropean,GreaterAfrican,GreaterEuropean,Jewish,Muslim,WestEuropean
Barack Obama,1.00,0.00,0,0,1.00,0.00,0,0.00,0
George Washington,0,0.07,1.00,0.00,0.00,0.93,0.00,0,0.00
John Smith,0,0.00,1.00,0.00,0.00,1.00,0.00,0,0.00
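The answer above leaves out best and stops at building the CSV text. As a rough sketch of one way to cover both, assuming each name always has exactly two score blocks (as in the question's data), the two best values can go into their own columns and the text can be written to a file. The column names best1 and best2, the shortened sample JSON, and the temp-file location are all illustrative assumptions, not part of the original code.

```groovy
import groovy.json.JsonSlurper

// Shortened sample with the same shape as the question's JSON (assumption).
def result = new JsonSlurper().parseText('''
{
  "George Washington": [
    { "scores": [ {"score": "0.07", "ethnicity": "Asian"},
                  {"score": "0.93", "ethnicity": "GreaterEuropean"} ],
      "best": "GreaterEuropean" },
    { "scores": [ {"score": "1.00", "ethnicity": "British"} ],
      "best": "British" }
  ],
  "Barack Obama": [
    { "scores": [ {"score": "1.00", "ethnicity": "GreaterAfrican"} ],
      "best": "GreaterAfrican" },
    { "scores": [ {"score": "1.00", "ethnicity": "Africans"} ],
      "best": "Africans" }
  ]
}
''')

// One row per person: the name, then the "best" value of each of the two blocks.
// best1/best2 are hypothetical column names chosen for this sketch.
def rows = result.collect { name, blocks ->
    [name, blocks[0].best, blocks[1].best]
}

// Header line followed by one comma-separated line per row.
def csvWithBest = (['name,best1,best2'] + rows.collect { it.join(',') }).join('\n') + '\n'

// Write the text to a temporary file (hypothetical location) and read it back.
def outFile = File.createTempFile('ethnicity', '.csv')
outFile.text = csvWithBest
println outFile.text
```

Joining with commas is only safe while no value contains a comma or a quote; real-world names may need quoting or a CSV library.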
result is a Map whose keys are the names. So you can get a name like this: def name = result.keySet()[0]
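Building on that, here is a small sketch that touches the three questions: collecting all names into a String[], counting elements with size(), and picking one best value inside the loop. The trimmed JSON (with empty scores lists) is an assumption made to keep the example short.

```groovy
import groovy.json.JsonSlurper

// Trimmed sample: same shape as the question's JSON, scores left empty (assumption).
def result = new JsonSlurper().parseText('''
{
  "George Washington": [ {"scores": [], "best": "GreaterEuropean"},
                         {"scores": [], "best": "British"} ],
  "John Smith":        [ {"scores": [], "best": "GreaterEuropean"},
                         {"scores": [], "best": "British"} ]
}
''')

// Question 1: all names as a String[], one element per name.
String[] header = result.keySet() as String[]

// Question 3: size() works on maps and on lists.
def people = result.size()                                 // number of names
def blocksForGeorge = result['George Washington'].size()   // score blocks for one name

// Question 2: inside the loop, index the list first, then ask for best.
// Indexes are zero-based, so [1] is the second block.
result.each { name, blocks ->
    println "Name: $name, Eth: ${blocks[1].best}"
}
```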
The result object supports Groovy's GPath.
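To illustrate what GPath gives you here, the following sketch navigates one person's data with plain property access. It assumes the same JSON shape as in the question, cut down to a single name; the variable names are illustrative.

```groovy
import groovy.json.JsonSlurper

// Single-person sample in the same shape as the question's JSON (assumption).
def result = new JsonSlurper().parseText('''
{ "Barack Obama": [
    { "scores": [ {"score": "0.00", "ethnicity": "Asian"},
                  {"score": "1.00", "ethnicity": "GreaterAfrican"} ],
      "best": "GreaterAfrican" },
    { "scores": [ {"score": "1.00", "ethnicity": "Africans"},
                  {"score": "0.00", "ethnicity": "Muslim"} ],
      "best": "Africans" } ] }
''')

def blocks = result['Barack Obama']

// Property access on a list is applied to every element of the list.
def bests = blocks.best                                  // ['GreaterAfrican', 'Africans']

// Chained access yields nested lists, so flatten() gives one flat list.
def allEthnicities = blocks.scores.ethnicity.flatten()

// Look up the score that belongs to the first block's best ethnicity.
def topScore = blocks[0].scores.find { it.ethnicity == blocks[0].best }.score

println bests
println allEthnicities
println topScore
```

This is the same mechanism the answer's map.scores.ethnicity expression relies on.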