Pandas to JSON: can't get it in the correct format

Time: 2019-05-17 17:09:59

Tags: python json pandas

This is really frustrating me, and I feel like I've tried everything. I have a basic Pandas dataframe that looks like this:

order   name        lat     long    open    close
123     Walgreens   37.5    50.4    08:00:00    17:00:00
456     CVS         16.7    52.4    09:00:00    12:00:00
789     McDonald's  90.7    59.1    12:00:00    14:00:00 

I need to convert that dataframe into a JSON object that looks like this:

    {
      "123": {
        "Location": {
          "Name": "Walgreens",
          "Lat": 37.5,
          "Long": 50.4
        },
        "Open": "08:00:00",
        "Close": "17:00:00"
      },
      "456": {
        "Location": {
          "Name": "CVS",
          "Lat": 16.7,
          "Long": 52.4
        },
        "Open": "09:00:00",
        "Close": "12:00:00"
      },
      "789": {
        "Location": {
          "Name": "McDonald's",
          "Lat": 90.7,
          "Long": 59.1
        },
        "Open": "12:00:00",
        "Close": "14:00:00"
      }
    }

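I've tried many ways to make it look like this, but either I get stuck with extra backslashes or I can't get the quoting right no matter what I do. I've used Pandas' to_json method, and I've converted the dataframe to a dict and then called json.loads or json.dumps, but it doesn't work properly.

One approach I tried was this: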

    import json

    json_dict = {}

    for i in df.index:
        order_no = df.loc[i, 'order']
        stop_name = df.loc[i, 'name']
        lat = df.loc[i, 'lat']
        lng = df.loc[i, 'long']
        start = df.loc[i, 'open']
        end = df.loc[i, 'close']
        # each value is a hand-built string, not a nested dict
        json_dict[str(order_no)] = (
            '{{"location" : {{ "name": "{0}", "lat" : "{1}", "long" : "{2}" }}, '
            '"open" : "{3}", "close" : "{4}" }}'
        ).format(stop_name, lat, lng, start, end)

    json.dumps(json_dict)

and I end up with a lot of backslashes in the output. How do I format this correctly? Thanks for your help!

3 Answers:

Answer 0: (score: 2)

With the source dataframe df as follows:

order   name        lat     long    open        close
123     Walgreens   37.5    50.4    08:00:00    17:00:00
456     CVS         16.7    52.4    09:00:00    12:00:00
789     McDonald's  90.7    59.1    12:00:00    14:00:00 

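To get the desired JSON output, we need to do the following:

  • Capitalize the column names
  • Create a Location column of dict type that aggregates name, lat, and long
  • Convert to JSON, with order as the top-level key

Code: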

# import json & pprint to pretty print the output
import json
import pprint

import pandas as pd

df.columns = [x.capitalize() for x in df.columns]
location_keys = ['Name', 'Lat', 'Long']
df['Location'] = df[location_keys].to_dict(orient='records')  
json_str = df.set_index('Order').drop(location_keys, axis=1).to_json(orient='index')

# print output with nice json formatting
pprint.pprint(json.loads(json_str))
# outputs:
{'123': {'Close': '17:00:00',
         'Location': {'Lat': '37.5', 'Long': '50.4', 'Name': 'Walgreens'},
         'Open': '08:00:00'},
 '456': {'Close': '12:00:00',
         'Location': {'Lat': '16.7', 'Long': '52.4', 'Name': 'CVS'},
         'Open': '09:00:00'},
 '789': {'Close': '14:00:00',
         'Location': {'Lat': '90.7', 'Long': '59.1', 'Name': "McDonald's"},
         'Open': '12:00:00'}}
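
The output above shows Lat and Long as strings, which suggests the answerer's dataframe held them as text. Below is a minimal self-contained sketch of the same steps, assuming lat and long are stored as floats (the sample dataframe is rebuilt here for illustration and is not from the original post), so the numbers come out unquoted as in the question's target:

```python
import json

import pandas as pd

# Sample data mirroring the question; float lat/long is an assumption.
df = pd.DataFrame({
    "order": [123, 456, 789],
    "name": ["Walgreens", "CVS", "McDonald's"],
    "lat": [37.5, 16.7, 90.7],
    "long": [50.4, 52.4, 59.1],
    "open": ["08:00:00", "09:00:00", "12:00:00"],
    "close": ["17:00:00", "12:00:00", "14:00:00"],
})

df.columns = [c.capitalize() for c in df.columns]
location_keys = ["Name", "Lat", "Long"]
# Collapse the three location columns into one dict per row.
df["Location"] = df[location_keys].to_dict(orient="records")

json_str = (
    df.set_index("Order")
      .drop(columns=location_keys)
      .to_json(orient="index")
)

# Parse once to check the structure, then pretty-print.
result = json.loads(json_str)
print(json.dumps(result, indent=2))
```

The integer order values become string keys, since JSON object keys are always strings.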

Answer 1: (score: 1)

If you set the index to order, you can orient on index:

import pandas as pd

records = [
    {'order': '123', 'name': 'Walgreens', 'lat': '37.5', 'long': '50.4', 'open': '08:00:00', 'close': '17:00:00'},
    {'order': '456', 'name': 'CVS', 'lat': '16.7', 'long': '52.4', 'open': '09:00:00', 'close': '12:00:00'},
    {'order': '789', 'name': "McDonald's", 'lat': '90.7', 'long': '59.1', 'open': '12:00:00', 'close': '14:00:00'},
]

df = pd.DataFrame(records)
df = df.set_index('order')

Now df looks like:


          close   lat  long        name      open
order
123    17:00:00  37.5  50.4   Walgreens  08:00:00
456    12:00:00  16.7  52.4         CVS  09:00:00
789    14:00:00  90.7  59.1  McDonald's  12:00:00

Convert it to a Python dict:

df.to_dict(orient='index')

{
   "123": {
      "close": "17:00:00",
      "lat": "37.5",
      "long": "50.4",
      "name": "Walgreens",
      "open": "08:00:00"
   },
   "456": {
      "close": "12:00:00",
      "lat": "16.7",
      "long": "52.4",
      "name": "CVS",
      "open": "09:00:00"
   },
   "789": {
      "close": "14:00:00",
      "lat": "90.7",
      "long": "59.1",
      "name": "McDonald's",
      "open": "12:00:00"
   }
}

As a complete statement:

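# if you prefer a one-liner
# (start from the raw records, because df above already has 'order' as its index)

# as python dict
json_dict = pd.DataFrame(records).set_index('order').to_dict(orient='index')

# or as json string
json_string = pd.DataFrame(records).set_index('order').to_json(orient='index')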
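
The dict produced this way is flat, whereas the question asks for name, lat, and long grouped under a Location key. One possible way to close that gap is sketched below. The helper `nest_location` is hypothetical (not part of pandas), and lat/long are assumed to be floats here rather than the strings used above:

```python
import json

import pandas as pd

records = [
    {"order": "123", "name": "Walgreens", "lat": 37.5, "long": 50.4,
     "open": "08:00:00", "close": "17:00:00"},
    {"order": "456", "name": "CVS", "lat": 16.7, "long": 52.4,
     "open": "09:00:00", "close": "12:00:00"},
]

# Flat dict keyed by order, as in the answer above.
flat = pd.DataFrame(records).set_index("order").to_dict(orient="index")


def nest_location(row):
    """Hypothetical helper: regroup one flat row into the question's layout."""
    return {
        "Location": {"Name": row["name"], "Lat": row["lat"], "Long": row["long"]},
        "Open": row["open"],
        "Close": row["close"],
    }


nested = {order: nest_location(row) for order, row in flat.items()}

# Serialize exactly once, from real dicts rather than hand-built strings.
json_string = json.dumps(nested, indent=2)
print(json_string)
```

Because every value is a real dict or scalar rather than a pre-formatted string, json.dumps has no quotes to escape and no backslashes appear.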

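Answer 2: (score: 0)

tl;dr

I had similar difficulty getting the right JSON format out of a Pandas dataframe, which I wanted to use to drive an API. I took a cue from how we usually work in SQL, where we parse problematic date values as strings before converting them to dates with built-in functions. ... You could consider doing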

json.dumps(json.loads(data_frame.to_json(orient="records")))

if that helps.
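
A likely source of the backslashes in the question is that text which is already JSON gets passed to json.dumps again, so its quotes are escaped a second time. The sketch below contrasts that with the parse-then-serialize round trip; the tiny `data_frame` is made up for illustration:

```python
import json

import pandas as pd

# Made-up one-row frame, only for demonstration.
data_frame = pd.DataFrame({"order": [123], "name": ["CVS"]})

# to_json already returns JSON text.
json_text = data_frame.to_json(orient="records")

# Dumping that text again wraps it in a JSON string literal and escapes its quotes.
double_encoded = json.dumps(json_text)

# Parsing first yields Python objects, so dumping serializes them exactly once.
round_tripped = json.dumps(json.loads(json_text))

print(double_encoded)
print(round_tripped)
```

The same reasoning applies to a dict whose values are hand-formatted JSON strings: json.dumps treats each value as plain text and escapes it.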