I have a JSON dict formatted like this:
{"cache_age_milliseconds": 0, "rows": [{"values": [["Sonos_HXXpu71TY1g4HWWU2jXCJ8tcKu", 1483225200000, "87.61.241.100", "*null*"], 0.3605555555555556]}, {"values": [["Sonos_HXXpu71TY1g4HWWU2jXCJ8tcKu", 1483221600000, "87.61.241.100", "*null*"], 0.35888888888888887]}], "columns": [{"type": "array", "label": ["Household ID", "__hour__", "ip", "SerialNumber.Config.RoomType"]}, {"type": "number", "label": "measure_value"}]}
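What is the fastest way to load this into a DataFrame?
Answer 0 (score: 3)
I may be wrong here, but is this close to what you want? I'm sure there is a less hacky way to do it, but the output is what matters here.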
import pandas as pd
data = {"cache_age_milliseconds": 0, "rows": [{"values": [["Sonos_HXXpu71TY1g4HWWU2jXCJ8tcKu", 1483225200000, "87.61.241.100", "*null*"], 0.3605555555555556]}, {"values": [["Sonos_HXXpu71TY1g4HWWU2jXCJ8tcKu", 1483221600000, "87.61.241.100", "*null*"], 0.35888888888888887]}], "columns": [{"type": "array", "label": ["Household ID", "__hour__", "ip", "SerialNumber.Config.RoomType"]}, {"type": "number", "label": "measure_value"}]}
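# Each row's "values" is [[dimension fields], measure]; the inner list supplies the column data.
df = pd.DataFrame([row["values"][0] for row in data["rows"]], columns=data["columns"][0]["label"])
# Use the measure (second element of "values") as the index, named after its column label.
df.index = [row["values"][1] for row in data["rows"]]
df.index.name = data["columns"][1]["label"]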
This results in:
                                   Household ID       __hour__             ip SerialNumber.Config.RoomType
measure_value
0.360556       Sonos_HXXpu71TY1g4HWWU2jXCJ8tcKu  1483225200000  87.61.241.100                       *null*
0.358889       Sonos_HXXpu71TY1g4HWWU2jXCJ8tcKu  1483221600000  87.61.241.100                       *null*
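If you would rather keep measure_value as an ordinary column instead of the index, one possible variation is to flatten each row before building the frame. The sketch below is only an illustration: rows_to_frame is a hypothetical helper name, and converting __hour__ with unit="ms" assumes those values are Unix epoch milliseconds, which the question does not state.

```python
import pandas as pd

data = {"cache_age_milliseconds": 0, "rows": [{"values": [["Sonos_HXXpu71TY1g4HWWU2jXCJ8tcKu", 1483225200000, "87.61.241.100", "*null*"], 0.3605555555555556]}, {"values": [["Sonos_HXXpu71TY1g4HWWU2jXCJ8tcKu", 1483221600000, "87.61.241.100", "*null*"], 0.35888888888888887]}], "columns": [{"type": "array", "label": ["Household ID", "__hour__", "ip", "SerialNumber.Config.RoomType"]}, {"type": "number", "label": "measure_value"}]}


def rows_to_frame(payload):
    """Hypothetical helper: one DataFrame column per label, nested lists flattened."""
    labels = []
    for col in payload["columns"]:
        label = col["label"]
        # "array" columns carry a list of labels; other columns carry a single string.
        labels.extend(label if isinstance(label, list) else [label])

    records = []
    for row in payload["rows"]:
        flat = []
        for value in row["values"]:
            # Flatten the nested dimension list; append scalar measures as they are.
            flat.extend(value if isinstance(value, list) else [value])
        records.append(flat)

    return pd.DataFrame(records, columns=labels)


flat_df = rows_to_frame(data)
# Assumption: "__hour__" holds Unix epoch milliseconds.
flat_df["__hour__"] = pd.to_datetime(flat_df["__hour__"], unit="ms")
print(flat_df)
```

This keeps all five fields as columns, so the usual filtering and grouping on measure_value works without resetting the index. Whether it is faster than the version above has not been measured here.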