I have a JSON object that looks like this:
{ "name":"bacon" "category":["food","meat","good"] "calories":"huge" }
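I'm trying to flatten it into a series of distinct rows. I need to build a fact table for Tableau, which can't consume crosstab data or JSON data directly.
I'm not picky about whether I do this in Python or Ruby, but so far I've been trying it in Ruby. I can easily parse the JSON and get a Ruby hash out of it, which seems like the first thing to do.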
{"name"=>"bacon", "category"=>["food", "meat", "good"], "calories" => "huge"}
I need to produce this:
name,category,calories
bacon,food,huge
bacon,meat,huge
bacon,good,huge
So I think I need to loop through that hash and un-nest it. I've been experimenting with something like this:
def Flatten(inHash)
  inHash.each do |key,value|
    if value.kind_of?(Hash)
      Flatten(value)
    else
      puts "#{value}"
    end
  end
end
However, this just prints every value once; it doesn't repeat the earlier values for each category. So the output I get looks like
bacon
food
meat
good
huge
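Is there a built-in method, gem, or library for this, or am I building it from scratch? Any ideas on how to get the output I want? I speak both Ruby and Python, so if you have a Python answer, please share it.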
Answer 0 (score: 2)
>>> # Assuming your JSON data is correctly formatted, as follows
>>> data = '{ "name":"bacon", "category":["food","meat","good"], "calories":"huge" }'
>>> import json
>>> from collections import namedtuple
>>> from itertools import product
>>> # Let's call our JSON parser foo (I am bad with names)
>>> def foo(data):
...     # First parse the text into a Python object
...     json_data = json.loads(data)
...     # Now create a namedtuple with the keys of the dictionary
...     food_matrix = namedtuple('food_matrix', json_data.keys())
...     # And create a tuple out of the values
...     data_tuple = food_matrix(*json_data.values())
...     # Now use itertools.product to create a cross product
...     data_matrix = list(product([data_tuple.name], data_tuple.category, [data_tuple.calories]))
...     # Now display the heading
...     print("{:15}{:15}{:15}".format("name", "category", "calories"))
...     # Now display the values
...     for e in data_matrix:
...         print("{:15}{:15}{:15}".format(*e))
...
>>> # Now call it
>>> foo(data)
name           category       calories
bacon          food           huge
bacon          meat           huge
bacon          good           huge
>>>
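The function above prints fixed-width columns, while the question asks for comma-separated rows. A minimal sketch of the same cross product written through the standard `csv` module might look like this; the function name `to_csv` is made up for illustration, and the three field names are hard-coded on the assumption that every record has exactly the shape shown in the question.

```python
import csv
import io
import json
from itertools import product


def to_csv(text):
    """Return CSV text with one row per category (hypothetical helper)."""
    record = json.loads(text)
    buf = io.StringIO()
    # lineterminator="\n" avoids the csv module's default "\r\n" line endings
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["name", "category", "calories"])
    # Wrap the scalar fields in one-element lists so product() repeats them
    writer.writerows(
        product([record["name"]], record["category"], [record["calories"]])
    )
    return buf.getvalue()


data = '{ "name":"bacon", "category":["food","meat","good"], "calories":"huge" }'
print(to_csv(data), end="")
```

Using `csv.writer` rather than joining on commas by hand means values that themselves contain commas or quotes get quoted correctly.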
Answer 1 (score: 0)
Assuming your JSON has commas (which makes it valid JSON), you can use itertools.product to enumerate all possible combinations:
import itertools as IT
import json
text = '{ "name":"bacon", "category":["food","meat","good"], "calories":"huge" }'
data = json.loads(text)
# Sort the keys in the order they appear in `text`
keys = sorted(data.keys(), key = lambda k: text.index(k))
# Promote the values to lists if they are not already lists
values = [data[k] if isinstance(data[k], list) else [data[k]] for k in keys]
print(','.join(keys))
for row in IT.product(*values):
    print(','.join(row))
which yields
name,category,calories
bacon,food,huge
bacon,meat,huge
bacon,good,huge
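Sorting keys by `text.index(k)` works for this input, but it can misorder the columns when a key's text also appears earlier in the document, for example inside a value. One possible alternative, sketched here under the assumption that the input is a single flat JSON object, is to let the parser keep the keys in document order with `object_pairs_hook`. The function name `flatten_record` is hypothetical.

```python
import itertools
import json
from collections import OrderedDict


def flatten_record(text):
    """Return (keys, rows) for one flat JSON object (hypothetical helper)."""
    # object_pairs_hook preserves the key order found in the JSON text
    data = json.loads(text, object_pairs_hook=OrderedDict)
    keys = list(data)
    # Promote scalars to one-element lists so product() can repeat them
    values = [v if isinstance(v, list) else [v] for v in data.values()]
    # str() lets numeric or boolean values be joined with commas later
    rows = [[str(cell) for cell in combo] for combo in itertools.product(*values)]
    return keys, rows


text = '{ "name":"bacon", "category":["food","meat","good"], "calories":"huge" }'
keys, rows = flatten_record(text)
print(','.join(keys))
for row in rows:
    print(','.join(row))
```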
Answer 2 (score: 0)
Here is my solution:
require 'json'
# Given a json object
json = JSON.parse('{"name":"bacon", "category":["food","meat","good"], "calories":"huge"}')
# First, normalize all the values to arrays
hash = Hash[json.map{|k, v| [k, [v].flatten]}]
# We now have a hash like {"name" => ["bacon"], ...}
# Then we'll make the product of the first array of values
# (in this case, ["bacon"]) with the other values
permutations = hash.values[0].product(*hash.values[1..-1])
# Now just need to output
puts hash.keys.join(",")
permutations.each{ |group| puts group.join(",") }
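For readers following along in Python, the same two steps (wrap every value in a list, then take the product) can be sketched as below. One consequence worth seeing is that when more than one field holds a list, the product emits every combination. The `tags` field and the `expand` function are invented for this illustration and are not part of the original data.

```python
from itertools import product


def expand(record):
    """Return one dict per combination of the record's values (hypothetical helper)."""
    # Wrap scalars in a one-element list so every value is a list
    columns = [v if isinstance(v, list) else [v] for v in record.values()]
    # zip(record, combo) pairs each key with its value for that combination
    return [dict(zip(record, combo)) for combo in product(*columns)]


# "tags" is a made-up second multi-valued field
record = {"name": "bacon", "category": ["food", "meat"], "tags": ["salty", "smoky"]}
rows = expand(record)
for row in rows:
    print(row)
```

Two categories and two tags produce four rows here, so if the list fields are independent attributes rather than something to cross, a separate table per field may fit the fact-table layout better.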