python: custom-formatted json.dumps output

Time: 2017-12-25 03:52:55

Tags: python python-3.x postgresql python-requests

I have a table info_tbl in postgresql:

        Column        |          Type          | Modifiers 
----------------------+------------------------+-----------
 task_info            | character varying(100) | 
 timestamp            | date                   | 
 task_count           | integer                | 

So I basically run "select * from info_tbl" against the database and use json.dumps to get the data out as JSON. But the output I get looks like this:

[
  {
    "task_info": "ABC",
    "timestamp": "2017-04-30",
    "task_count": 993
  },
  {
    "task_info": "PQR",
    "timestamp": "2017-05-31",
    "task_count": 413
  }
]
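For reference, here is a minimal sketch of how a flat dump like the one above comes about. It does not touch the database: the `rows` list is a hypothetical stand-in for what a dict-style cursor would return, and `default=str` is an assumption about how the date column gets serialized, since `json.dumps` cannot encode `date` objects on its own.

```python
import json
from datetime import date

# Hypothetical in-memory rows standing in for cur.fetchall() with a dict-style cursor
rows = [
    {"task_info": "ABC", "timestamp": date(2017, 4, 30), "task_count": 993},
    {"task_info": "PQR", "timestamp": date(2017, 5, 31), "task_count": 413},
]

# date objects are not JSON serializable by default, so default=str
# is used here to turn them into ISO-formatted strings
flat_json = json.dumps(rows, default=str, indent=2)
print(flat_json)
```

Each row becomes its own JSON object, which is why nothing is grouped by task_info yet.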

What I actually want to get is this:

[
  {
    "task_info": "ABC",
    "data_to_plot": [["2017-04-30", "993"],["2017-05-28", "624"],["2017-06-21", "811"]]
  },
  {
    "task_info": "PQR",
    "data_to_plot": [["2017-05-31","413"],["2017-06-16", "773"],["2017-07-21", "941"],["2017-08-30", "493"]]
  }
]

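These outputs are only for illustration, so just the first two records are shown; the actual table has more than 1000 records. I will use the data to plot charts.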

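1 answer:

Answer 0: (score: 0)

It turns out @furas was right: reformatting the dumped JSON is not good practice, so I reshaped the data with pandas before serializing it. I thought this would be useful for anyone else preparing data as input for HighCharts plots, so I am posting my pandas workaround.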

import json

import pandas as pd
import psycopg2
from psycopg2.extras import RealDictCursor

con = psycopg2.connect("dbname='yourDBname' user='yourUserName' host='yourAddressOrLocalhost' password='yourPassword'")
# RealDictCursor returns each row as a dict keyed by column name
cur = con.cursor(cursor_factory=RealDictCursor)

query = "select * from info_tbl"
cur.execute(query)

df = pd.DataFrame(cur.fetchall())
# Build one [date, count] pair per row; task_count stays an integer here,
# so wrap it in str() if string values are needed
df['data_to_plot'] = df.apply(lambda row: [str(row['timestamp']), row['task_count']], axis=1)

# Collect the pairs for each task_info into a list, then serialize as records
result = df.groupby('task_info')['data_to_plot'].apply(list).reset_index().to_json(orient='records')
print(json.dumps(json.loads(result), indent=2))

Output:

'''
    [
      {
        "task_info": "ABC",
        "data_to_plot": [["2017-04-30", "993"],["2017-05-28", "624"],["2017-06-21", "811"]]
      },
      {
        "task_info": "PQR",
        "data_to_plot": [["2017-05-31","413"],["2017-06-16", "773"],["2017-07-21", "941"],["2017-08-30", "493"]]
      }
    ]
'''
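If pulling in pandas just for this feels heavy, the same grouping can be done with the standard library. This is only a sketch under assumptions: `group_for_plot` and `sample_rows` are hypothetical names, and the sample rows are in-memory stand-ins for the dicts a RealDictCursor would return, using a few values from the question. Both values in each pair are converted to strings to match the desired output in the question.

```python
import json
from collections import defaultdict
from datetime import date


def group_for_plot(rows):
    """Group rows by task_info into [date_string, count_string] pairs."""
    grouped = defaultdict(list)
    for row in rows:
        pair = [str(row["timestamp"]), str(row["task_count"])]
        grouped[row["task_info"]].append(pair)
    # dicts keep insertion order, so groups appear in first-seen order
    return [{"task_info": key, "data_to_plot": pairs} for key, pairs in grouped.items()]


# Hypothetical sample rows; real code would pass cur.fetchall() instead
sample_rows = [
    {"task_info": "ABC", "timestamp": date(2017, 4, 30), "task_count": 993},
    {"task_info": "PQR", "timestamp": date(2017, 5, 31), "task_count": 413},
    {"task_info": "ABC", "timestamp": date(2017, 5, 28), "task_count": 624},
    {"task_info": "PQR", "timestamp": date(2017, 6, 16), "task_count": 773},
]

result = group_for_plot(sample_rows)
print(json.dumps(result, indent=2))
```

Because the data is reshaped before serialization, json.dumps is called only once and nothing has to be parsed back. Drop the `str()` around `task_count` if the chart expects numeric values.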