I have a table info_tbl in PostgreSQL:
Column | Type | Modifiers
----------------------+------------------------+-----------
task_info | character varying(100) |
timestamp | date |
task_count | integer |
I basically run "select * from info_tbl" against the database and use json.dumps to get the result as JSON. But the output I get looks like this:
[
{
"task_info": "ABC",
"timestamp": "2017-04-30",
"task_count": 993
},
{
"task_info": "PQR",
"timestamp": "2017-05-31",
"task_count": 413
}
]
What I actually want to achieve is this:
[
{
"task_info": "ABC",
"data_to_plot": [["2017-04-30", "993"],["2017-05-28", "624"],["2017-06-21", "811"]]
},
{
"task_info": "PQR",
"data_to_plot": [["2017-05-31","413"],["2017-06-16", "773"],["2017-07-21", "941"],["2017-08-30", "493"]]
}
]
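These outputs are for illustration only, so just the first two records are shown; the actual table has more than 1000 records. I will use this data to plot charts.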
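Answer 0 (score: 0)
As it turns out, @furas was right that reformatting the dump afterwards is not good practice, so I used pandas to reshape the data instead. I think this will be useful for others who are also preparing data as input for HighCharts plots, so I am posting the pandas workaround.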
import json

import pandas as pd
import psycopg2
from psycopg2.extras import RealDictCursor

# Connect and fetch every row as a dict keyed by column name
con = psycopg2.connect("dbname='yourDBname' user='yourUserName' host='yourAddressOrLocalhost' password='yourPassword'")
cur = con.cursor(cursor_factory=RealDictCursor)
query = "select * from info_tbl"
cur.execute(query)
df = pd.DataFrame(cur.fetchall())
# Build one [date, count] pair per row; str() keeps both values as strings, as in the desired output
df['data_to_plot'] = df.apply(lambda row: [str(row['timestamp']), str(row['task_count'])], axis=1)
# Collect the pairs of each task_info into a list and serialize as a list of records
result = df.groupby('task_info')['data_to_plot'].apply(list).reset_index().to_json(orient='records')
print(json.dumps(json.loads(result), indent=2))
Output:
'''
[
{
"task_info": "ABC",
"data_to_plot": [["2017-04-30", "993"],["2017-05-28", "624"],["2017-06-21", "811"]]
},
{
"task_info": "PQR",
"data_to_plot": [["2017-05-31","413"],["2017-06-16", "773"],["2017-07-21", "941"],["2017-08-30", "493"]]
}
]
'''
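For readers who would rather not pull in pandas for this one step, the same grouping can be sketched with the standard library alone. This is a minimal sketch, not part of the original answer: the function name `group_for_plot` and the `sample_rows` list are hypothetical, and the rows are assumed to be dicts like the ones `RealDictCursor` returns, with `timestamp` arriving as a `datetime.date`.

```python
import json
from collections import defaultdict
from datetime import date


def group_for_plot(rows):
    """Group row dicts by task_info into lists of [date, count] string pairs."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["task_info"]].append(
            [str(row["timestamp"]), str(row["task_count"])]
        )
    # Dicts keep insertion order, so tasks appear in first-seen order
    return [
        {"task_info": task, "data_to_plot": points}
        for task, points in grouped.items()
    ]


# Hypothetical sample rows standing in for cur.fetchall()
sample_rows = [
    {"task_info": "ABC", "timestamp": date(2017, 4, 30), "task_count": 993},
    {"task_info": "PQR", "timestamp": date(2017, 5, 31), "task_count": 413},
    {"task_info": "ABC", "timestamp": date(2017, 5, 28), "task_count": 624},
]

result = group_for_plot(sample_rows)
print(json.dumps(result, indent=2))
```

With a live connection, `group_for_plot(cur.fetchall())` would take the place of the sample list. If the chart library expects numeric y-values, dropping the `str()` around `task_count` should be enough.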