Flatten a JSON object with other embedded JSON objects

Date: 2013-03-06 15:15:14

Tags: python ruby json

I have a JSON object that looks like this:

{
  "name":"bacon"
  "category":["food","meat","good"]
  "calories":"huge"
}

I'm trying to flatten it into a series of unique rows. I need to build a fact table for Tableau, which can't consume crosstab data or JSON data directly.

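I'm not picky about whether I do this in Python or Ruby, but so far I've been trying in Ruby. I can easily parse the JSON and get a Ruby hash out of it, which seems like the first thing to do.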

{"name"=>"bacon", "category"=>["food", "meat", "good"], "calories" => "huge"}

I need to produce this:

name,category,calories
bacon,food,huge
bacon,meat,huge
bacon,good,huge

So I figure I need to loop over that hash and try to un-nest it. I've been trying something like this:

def Flatten(inHash)
    inHash.each do |key,value|
        if value.kind_of?(Hash)
            Flatten(value)
        else
            puts "#{value}"
        end 
    end 
end

However, this just prints every value once; it doesn't repeat the earlier values for each category. So the output I get looks like

bacon
food
meat
good
huge

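Is there a built-in method, gem, or library for this, or am I building it from scratch? Any ideas on how to get the output I want? I speak both Ruby and Python, so if you have a Python answer, please share.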

3 Answers:

Answer 0: (score: 2)

>>> # Assuming your JSON data is correctly formatted, as follows
>>> data = '{ "name":"bacon", "category":["food","meat","good"], "calories":"huge" }'
>>> # Let's call our JSON parser foo (I am bad with names)
>>> def foo(data):
    import json
    # First parse the string into a Python object
    json_data = json.loads(data)
    from collections import namedtuple
    # Now create a namedtuple with the keys of the dictionary
    food_matrix = namedtuple('food_matrix', json_data.keys())
    # And create a tuple out of the values
    data_tuple = food_matrix(*json_data.values())
    # Now create a cross product with itertools.product
    from itertools import product
    data_matrix = list(product([data_tuple.name], data_tuple.category, [data_tuple.calories]))
    # Now display the heading
    print("{:15}{:15}{:15}".format("name", "category", "calories"))
    # Now display the values
    for e in data_matrix:
        print("{:15}{:15}{:15}".format(*e))


>>> #Now call it
>>> foo(data)
name           category       calories                  
bacon          food           huge           
bacon          meat           huge           
bacon          good           huge           
>>> 
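
To see why wrapping the scalar fields in one-element lists repeats them on every row, here is a minimal sketch of what itertools.product does with these inputs. The variable name rows is only for illustration and is not part of the answer above.

```python
from itertools import product

# Each argument is one "column" of candidate values. A one-element list
# contributes the same value to every combination, so "bacon" and "huge"
# are repeated once per category.
rows = list(product(["bacon"], ["food", "meat", "good"], ["huge"]))

for row in rows:
    print(",".join(row))
```

product varies the rightmost argument fastest, so with a single multi-valued column the rows come out in the order of that list.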

Answer 1: (score: 0)

Assuming your JSON has commas (making it valid JSON), you can use itertools.product to enumerate all possible combinations:

import itertools as IT
import json

text = '{ "name":"bacon", "category":["food","meat","good"], "calories":"huge" }'
data = json.loads(text)

# Sort the keys in the order they appear in `text`
keys = sorted(data.keys(), key = lambda k: text.index(k))

# Promote the values to lists if they are not already lists
values = [data[k] if isinstance(data[k], list) else [data[k]] for k in keys]

print(','.join(keys))
for row in IT.product(*values):
    print(','.join(row))

yields

name,category,calories
bacon,food,huge
bacon,meat,huge
bacon,good,huge
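
If you want this as a reusable step that writes real CSV, the same idea could be sketched roughly as follows. This is an assumption-laden sketch rather than a definitive implementation: flatten_record is a hypothetical helper name, and it assumes Python 3.7+, where dicts keep insertion order, so the keys do not need to be re-sorted by their position in the text.

```python
import csv
import io
import itertools
import json


def flatten_record(record):
    """Yield one flat dict per combination of the list-valued fields.

    Hypothetical helper for illustration; only top-level lists are expanded.
    """
    keys = list(record)
    # Wrap scalars in a one-element list so product() repeats them on every row.
    value_lists = [v if isinstance(v, list) else [v] for v in record.values()]
    for combo in itertools.product(*value_lists):
        yield dict(zip(keys, combo))


text = '{"name": "bacon", "category": ["food", "meat", "good"], "calories": "huge"}'
record = json.loads(text)

# Write to an in-memory buffer; a real file opened with newline="" works the same way.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record), lineterminator="\n")
writer.writeheader()
writer.writerows(flatten_record(record))

csv_text = buffer.getvalue()
print(csv_text, end="")
```

Going through the csv module means values containing commas or quotes get escaped, and non-string values such as numbers are converted for you. Note that a field holding an empty list produces no rows at all, because the product of anything with an empty sequence is empty.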

Answer 2: (score: 0)

Here's my solution:

require 'json'

# Given a json object
json = JSON.parse('{"name":"bacon", "category":["food","meat","good"], "calories":"huge"}')

# First, normalize all the values to arrays
hash = Hash[json.map{|k, v| [k, [v].flatten]}]

# We now have a hash like {"name" => ["bacon"], ...}

# Then we'll make the product of the first array of values 
# (in this case, ["bacon"]) with the other values
permutations = hash.values[0].product(*hash.values[1..-1])

# Now just need to output
puts hash.keys.join(",")
permutations.each{ |group| puts group.join(",") }