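I receive raw data from the Mixpanel API. I'd like to convert it to a CSV file so that I can work with the data in Excel. I tried this online tool (http://jsfiddle.net/sturtevant/vUnF9/), but it doesn't seem to handle nested JSON results. What is the best way to do this?
Here is a sample of the output: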
{"event":"Event.Name","properties":{"time":1376784014,"distinct_id":"distinctID","$app_version":"1.XX","$city":"cityName","$ios_ifa":"iosIfa","$lib_version":"X.Y.Z","$manufacturer":"Apple","$model":"model","$os":"iPhone OS","$os_version":"X.Y.Z","$region":"Region","$screen_height":999,"$screen_width":999,"$wifi":true,"App Version":"1.XX","BattleDuration":"99","BattleNum":"2","Episode Num":"2","PlayerVictory":"1","mp_country_code":"CODE","mp_device_model":"Model","mp_lib":"iphone"}}
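Answer 0 (score: 1)
I assume this is just one of many records you are dealing with. Essentially, you need to convert each JSON object into a flatter object with no nesting, without losing the keys or the relationships between them.
This...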
{
    "event":"Event.Name",
    "properties":{
        "time":1376784014,
        "distinct_id":"distinctID",
        ....
        ....
    }
}
can be converted to... (you can replace _ with any other separator)
{
    "mixpanel_event":"Event.Name",
    "mixpanel_properties_time":"1376784014",
    "mixpanel_properties_distinct_id":"distinctID",
    ....
    ....
}
You can then write this structure to a CSV file with csv.DictWriter.
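As a rough sketch of the csv.DictWriter step, assuming the records have already been flattened into plain dicts: the helper name write_rows and the sample rows are made up for illustration. Different events can carry different properties, so the header is built as the union of all keys and missing cells are left empty.

```python
import csv
import io


def write_rows(rows, out):
    """Write a list of flat dicts to the file-like object `out` as CSV.

    `write_rows` is a hypothetical helper, not part of any library.
    """
    # Build the header as the union of all keys, in first-seen order,
    # because not every record has every property.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills in cells for keys a given row does not have.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Illustrative rows only; real rows would come from the flattening step.
rows = [
    {"mixpanel_event": "Event.Name", "mixpanel_properties_time": "1376784014"},
    {"mixpanel_event": "Event.Name", "mixpanel_properties_distinct_id": "distinctID"},
]
buf = io.StringIO()
write_rows(rows, buf)
print(buf.getvalue())
```

To write a real file instead of an in-memory buffer in Python 3, open it with open(path, "w", newline="") and pass the file object as out.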
To do the flattening, you can use a recursive function like this:
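def reduce_item(key, value):
    """Recursively flatten value into the global dict reduced_item."""
    global reduced_item
    # Reduction condition 1: a list, so append each element's index to the key
    if isinstance(value, list):
        for i, sub_item in enumerate(value):
            reduce_item(key + '_' + str(i), sub_item)
    # Reduction condition 2: a dict, so append each sub-key to the key
    elif isinstance(value, dict):
        for sub_key in value:
            reduce_item(key + '_' + str(sub_key), value[sub_key])
    # Base condition: a scalar, stored as a string under the accumulated key
    else:
        reduced_item[str(key)] = str(value)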
You can then call the function like this:
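import json

raw_data = json.loads("your_json_string")  # placeholder: pass your actual JSON text here
reduced_item = {}  # must exist before the call; reduce_item fills it in
reduce_item("mixpanel", raw_data)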
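If you have many records, a variant that returns a new dict on each call may be easier to use than a global. The following is only a sketch under assumptions: the function name flatten, its sep parameter, and the idea that the input arrives as one JSON object per line are mine, and the sample lines are invented.

```python
import json


def flatten(value, key="mixpanel", sep="_", out=None):
    """Return a flat dict for `value`; nested keys are joined with `sep`.

    `flatten` is a hypothetical name; it follows the same recursion as
    reduce_item but collects results in `out` instead of a global.
    """
    if out is None:
        out = {}
    if isinstance(value, dict):
        for sub_key, sub_value in value.items():
            flatten(sub_value, key + sep + str(sub_key), sep, out)
    elif isinstance(value, list):
        for i, sub_value in enumerate(value):
            flatten(sub_value, key + sep + str(i), sep, out)
    else:
        # Scalars are stored as strings, as in the original function.
        out[key] = str(value)
    return out


# Assumption: one JSON object per line; these lines are made-up samples.
lines = [
    '{"event":"Event.Name","properties":{"time":1376784014,"$wifi":true}}',
    '{"event":"Other","properties":{"tags":["a","b"]}}',
]
records = [flatten(json.loads(line)) for line in lines if line.strip()]
for record in records:
    print(record)
```

Each element of records is a flat dict, so the list can be handed to csv.DictWriter as rows.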
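Answer 1 (score: 0)
You can try the sample code below. It uses recursive functions to collect the keys and the values (you must somehow make sure that the two lists stay in the same order).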
import sys
import json


def getKeys(newDict):
    """Return the leaf keys of a nested dict, depth first."""
    retv = []
    for key in newDict.keys():
        try:
            # Only dicts have .keys(); anything else raises AttributeError
            newDict[key].keys()
            retv.extend(getKeys(newDict[key]))
        except AttributeError:
            retv.append(key)
    return retv


def getValues(newDict):
    """Return the leaf values of a nested dict, in the same traversal order."""
    retv = []
    for key in newDict.keys():
        try:
            newDict[key].keys()
            retv.extend(getValues(newDict[key]))
        except AttributeError:
            retv.append(newDict[key])
    return retv


def main():
    filename = ''  # Add your filename
    with open(filename) as f:
        t = json.load(f)
    keys = getKeys(t)
    result = getValues(t)
    # print() with a single argument works in both Python 2 and Python 3
    print(keys)
    print(result)


if __name__ == '__main__':
    main()
    sys.exit(0)
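One way to keep keys and values aligned is to collect them as pairs in a single traversal rather than in two separate passes. This is a sketch, not the code from the answer above: the generator name iter_leaves and the sample record are invented, and the ordering relies on dicts preserving insertion order (Python 3.7+).

```python
def iter_leaves(d):
    """Yield (key, value) pairs for every non-dict leaf, depth first.

    `iter_leaves` is a hypothetical helper. Like getKeys, it reports only
    the leaf key, so equal names at different nesting levels will repeat.
    """
    for key, value in d.items():
        if isinstance(value, dict):
            for pair in iter_leaves(value):
                yield pair
        else:
            yield key, value


# Made-up record shaped like the sample in the question.
record = {
    "event": "Event.Name",
    "properties": {"time": 1376784014, "distinct_id": "distinctID"},
}
pairs = list(iter_leaves(record))
keys = [k for k, _ in pairs]
values = [v for _, v in pairs]
print(keys)
print(values)
```

Because each key is emitted together with its value, the two lists cannot drift out of step.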