Convert a string of dictionaries into a list of dictionaries for insertion into MongoDB

Time: 2018-09-13 12:37:30

Tags: python mongodb list dictionary

I have the following string data (dict_string). The records are not separated by commas or anything else, but each line ends with \n:

data = {"Date1":"2017-02-13T00:00:00.000Z","peerval":"222.22000","PID":109897,"Title":"Prop 1","Temp":5,"Temp Actual":5,"Temp Predicted":3.9,"Level":"Medium","Explaination":"Source: Some title data might me data \n  some link http:\\www.ggogle\.com with some sepcial characters >< ? // ","creator":"\\etc\\someid","createdtime" :"2017-02-12T15:24:38.380Z"}\n
       {"Date1":"2017-02-13T00:00:00.000Z","peerval":"222.22000","PID":109890,"Title":"Prop 2","Temp":5,"Temp Actual":5,"Temp Predicted":3.9,"Level":"Medium","Explaination":"Source: Some title data might me data \n  some link http:\\www.ggogle\.com with some sepcial characters >< ? //","creator":"\\etc\\someid","createdtime" :"2017-02-12T15:24:38.380Z"}\n

I want to convert it into a list of dictionaries:

[{"Date1":"2017-02-13T00:00:00.000Z","peerval":"222.22000","PID":109897,"Title":"Prop 1","Temp":5,"Temp Actual":5,"Temp Predicted":3.9,"Level":"Medium","Explaination":"Source: Some title data might me data \n  some link http:\\www.ggogle\.com with some sepcial characters >< ? // ","creator":"\\etc\\someid","createdtime" :"2017-02-12T15:24:38.380Z"},
{"Date1":"2017-02-13T00:00:00.000Z","peerval":"222.22000","PID":109890,"Title":"Prop 2","Temp":5,"Temp Actual":5,"Temp Predicted":3.9,"Level":"Medium","Explaination":"Source: Some title data might me data \n  some link http:\\www.ggogle\.com with some sepcial characters >< ? //","creator":"\\etc\\someid","createdtime" :"2017-02-12T15:24:38.380Z"}]

so that I can insert it into MongoDB.

I tried replacing and then splitting, as follows:

data = data.replace("\n{", "|{")
data = data.split("|")

But this produces a list of strings that ends with \n, for example: ['{}','{}'...,\n]

eval throws a string-literal error.

How can I achieve this? Is there any way to use json.loads or some other method?

1 answer:

Answer 0 (score: 0)

Use a regular expression together with the ast module.

For example:


import re
import ast

data = '''{"Date1":"2017-02-13T00:00:00.000Z","peerval":"222.22000","PID":109897,"Title":"Prop 1","Temp":5,"Temp Actual":5,"Temp Predicted":3.9,"Level":"Medium","Explaination":"Source: Some title data might me data \n  some link http:\\www.ggogle\.com with some sepcial characters >< ? // ","creator":"\\etc\\someid","createdtime" :"2017-02-12T15:24:38.380Z"}
       {"Date1":"2017-02-13T00:00:00.000Z","peerval":"222.22000","PID":109890,"Title":"Prop 2","Temp":5,"Temp Actual":5,"Temp Predicted":3.9,"Level":"Medium","Explaination":"Source: Some title data might me data \n  some link http:\\www.ggogle\.com with some sepcial characters >< ? //","creator":"\\etc\\someid","createdtime" :"2017-02-12T15:24:38.380Z"}'''

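# Remove line breaks, then match each {...} block non-greedily and parse it as a
# Python literal (assumes no value contains a "}" character).
for i in re.findall(r"\{.*?\}", data.replace('\r', '').replace('\n', ''), flags=re.DOTALL):
    print(ast.literal_eval(i))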
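
As for the json.loads part of the question: if every line of the input is valid JSON on its own, parsing line by line may be simpler than a regex. The sketch below is only an illustration under that assumption. The helper name parse_json_lines and the shortened sample are hypothetical and not from the original post. Note that the \. sequence in the question's sample is not a valid JSON escape, so real data like that would have to be cleaned before json.loads accepts it.

```python
import json


def parse_json_lines(text):
    """Parse newline-delimited JSON objects into a list of dicts.

    Hypothetical helper: assumes each non-blank line is one valid JSON object.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. the one after a trailing newline
        records.append(json.loads(line))
    return records


# Simplified, made-up sample. The raw string keeps \n and \\ as JSON escapes
# instead of letting Python interpret them first.
sample = r'''{"PID": 109897, "Title": "Prop 1", "Explaination": "line one \n line two", "creator": "\\etc\\someid"}
       {"PID": 109890, "Title": "Prop 2", "Temp Predicted": 3.9}
'''

docs = parse_json_lines(sample)
print(docs)
```

The resulting list of dicts is the shape that pymongo's collection.insert_many(docs) expects, assuming a collection object obtained from an existing MongoDB connection.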