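I have multiple dictionaries that are not separated by commas, all held in a single string. Is it possible to split them apart and get a clean list in which each element is one dictionary?
For example, what I have: {} {} {}
What I want: [{},{},{}]
I know this is similar to "Want to separate list of dictionaries not separated by comma", but I don't want to spawn a subprocess and call sed.
Example: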
data = {"key1":"val1", "key2":"val2", "key3":"val3", "key4":"val4"} {"key1":"someval", "key2":"someval", "key3":"someval", "key4":"someval"} {"key1":"someval", "key2":"someval", "key3":"someval", "key4":"someval"}
What I want is:
[{"key1":"val1", "key2":"val2", "key3":"val3", "key4":"val4"} ,
{"key1":"someval", "key2":"someval", "key3":"someval", "key4":"someval"},
{"key1":"someval", "key2":"someval", "key3":"someval", "key4":"someval"}]
How can I achieve this?
Example 2:
string = '''{"Date1":"2017-02-13T00:00:00.000Z","peerval":"222.22000","PID":109897,"Title":"Prop 1","Temp":5,"Temp Actual":5,"Temp Predicted":3.9,"Level":"Medium","Explaination":"Source: {some data \n some link http:\\www.ggogle\.com with some sepcial characters ">< ?? // {} [] ;;}","creator":"\\etc\\someid","createdtime" :"2017-02-12T15:24:38.380Z"}
{"Date1":"2017-02-13T00:00:00.000Z","peerval":"222.22000","PID":109890,"Title":"Prop 2","Temp":5,"Temp Actual":5,"Temp Predicted":3.9,"Level":"Medium","Explaination":"Source: {some data \n some link http:\\www.ggogle\.com with some sepcial characters ">< ?? // {} [] ;;}","creator":"\\etc\\someid","createdtime" :"2017-02-12T15:24:38.380Z"}
'''
Note: each dictionary ends with $ (a newline character).
Answer 0 (score: 4)
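This approach is a bit slow (roughly O(N^2) in the length of the string), but it can handle quite complex literal syntax, including nested data structures. It calls ast.literal_eval in a loop on successively smaller slices of s until it finds a slice that is syntactically valid. It then removes that slice and continues until the string is empty.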
import ast

def parse_consecutive_literals(s):
    result = []
    while s:
        for i in range(len(s), 0, -1):
            #print(i, repr(s), repr(s[:i]), len(result))
            try:
                obj = ast.literal_eval(s[:i])
            except SyntaxError:
                continue
            else:
                result.append(obj)
                s = s[i:].strip()
                break
        else:
            raise Exception("Couldn't parse remainder of string: " + repr(s))
    return result

test_cases = [
    "{} {} {}",
    "{}{}{}",
    "{1: 2, 3: 4}{5:6, '7': [8, {9: 10}]}",
    "[11] 'twelve' 13 14.0",
    "{} {\"'hi '}'there\"} {'whats \"}\"{\"up'}",
    "{1: 'foo\\'}bar'}"
]
for s in test_cases:
    print("{} parses into {}".format(repr(s), parse_consecutive_literals(s)))
Result:
'{} {} {}' parses into [{}, {}, {}]
'{}{}{}' parses into [{}, {}, {}]
"{1: 2, 3: 4}{5:6, '7': [8, {9: 10}]}" parses into [{1: 2, 3: 4}, {5: 6, '7': [8, {9: 10}]}]
"[11] 'twelve' 13 14.0" parses into [[11], 'twelve', 13, 14.0]
'{} {"\'hi \'}\'there"} {\'whats "}"{"up\'}' parses into [{}, {"'hi '}'there"}, {'whats "}"{"up'}]
"{1: 'foo\\'}bar'}" parses into [{1: "foo'}bar"}]
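However, I wouldn't be keen on using this solution in production-quality code. It would be much better to serialize the data in a more sensible format in the first place, such as JSON.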
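A minimal sketch of that suggestion, assuming the data really is valid JSON (double-quoted keys and strings, as in the first example of the question). The standard library's json.JSONDecoder.raw_decode parses one value starting at a given index and reports where it stopped, so the string can be walked in a single pass. The function name parse_concatenated_json is made up for illustration.

```python
import json

def parse_concatenated_json(text):
    """Parse back-to-back JSON values separated only by optional whitespace."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    n = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < n and text[pos].isspace():
            pos += 1
        if pos >= n:
            break
        # Returns the decoded value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values

data = '{"key1":"val1", "key2":"val2"} {"key1":"someval"}\n{"key1":"x"}'
print(parse_concatenated_json(data))
# [{'key1': 'val1', 'key2': 'val2'}, {'key1': 'someval'}, {'key1': 'x'}]
```

Unlike the slicing loop above, this only accepts JSON, so Python-only literals such as sets, tuples, or single-quoted strings raise json.JSONDecodeError.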
Answer 1 (score: 3)
The runtime is O(n); it uses Python library routines that are optimized for string concatenation and adds no overhead. I also did some timings.
def fetch_until(sep, char_iter):
    chars = []
    escapes = 0
    while True:
        try:
            c = next(char_iter)
        except StopIteration:
            break
        if c == "\\":
            escapes += 1
        chars.append(c)
        if c == sep:
            if escapes % 2 == 0:
                break
        if c != "\\":
            escapes = 0
    return chars

def fix(data):
    brace_level = 0
    result = []
    char_iter = iter(data)
    try:
        while True:
            c = next(char_iter)
            result.append(c)
            if c in ("'", '"'):
                result.extend(fetch_until(c, char_iter))
            elif c == "{":
                brace_level += 1
            elif c == "}":
                brace_level -= 1
                if brace_level == 0:
                    result.append(",")
    except StopIteration:
        pass
    return eval("[{}]".format("".join(result[:-1])))

test_cases = [
    "{1: 'foo\\'}bar'}",
    "{} {\"'hi '}'there\"} {'whats \"}\"{\"up'}",
    "{}{}{}",
    "{1: 2, 3: 4}{5:6, '7': [8, {9: 10}]}",
    "{1: {}} {2:3, 4:{}} {(1,)}",
    "{1: 'foo'} {'bar'}",
]
for test_case in test_cases:
    print("{!r:40s} -> {!r}".format(test_case, fix(test_case)))
Prints (on my slow Macbook):
"{1: 'foo\\'}bar'}" -> [{1: "foo'}bar"}]
'{} {"\'hi \'}\'there"} {\'whats "}"{"up\'}' -> [{}, {"'hi '}'there"}, {'whats "}"{"up'}]
'{}{}{}' -> [{}, {}, {}]
"{1: 2, 3: 4}{5:6, '7': [8, {9: 10}]}" -> [{1: 2, 3: 4}, {5: 6, '7': [8, {9: 10}]}]
'{1: {}} {2:3, 4:{}} {(1,)}' -> [{1: {}}, {2: 3, 4: {}}, {(1,)}]
"{1: 'foo'} {'bar'}" -> [{1: 'foo'}, {'bar'}]
Answer 2 (score: -1)
You can convert it into a valid JSON string, after which this is easy to do.
import json
# Insert the missing commas and wrap in brackets so the result is a valid JSON array.
mydict_string = '[' + mydict_string.replace(' {', ',{') + ']'
mylist = json.loads(mydict_string)
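Otherwise, although I don't recommend it, you can also use eval.
mylist = list(map(eval, mydict_string.split(' ')))
This works even when the inner dictionaries are not empty, as long as they contain no spaces.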
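To illustrate why eval is discouraged here, the sketch below contrasts it with ast.literal_eval, which only accepts literal syntax and refuses anything that would execute code. The helper name is_plain_literal is hypothetical and exists only for this example.

```python
import ast

def is_plain_literal(text):
    """Return True if text is a Python literal that ast.literal_eval accepts."""
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # ValueError: valid syntax but not a literal (e.g. a function call).
        # SyntaxError: not valid Python at all.
        return False
    return True

print(is_plain_literal("{'key1': 'val1'}"))           # True
print(is_plain_literal("__import__('os').getcwd()"))  # False, eval would run this
```

If the input string can come from an untrusted source, swapping eval for ast.literal_eval keeps the same results for plain dictionaries while rejecting expressions like the second one.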
Answer 3 (score: -1)
Assuming dict_string is your input string, you can try:
import json
my_dicts = [json.loads(i) for i in dict_string.replace(", ",",").split()]
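Splitting on whitespace breaks as soon as a value contains a space. A variant that leans on the newline terminator mentioned in the question might look like the sketch below. It assumes every line holds exactly one complete, valid JSON object with no raw newline inside any string, which may not hold for the string in Example 2 as posted. The name parse_json_lines is made up for illustration.

```python
import json

def parse_json_lines(text):
    """Decode one JSON object per non-empty line."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]

sample = '{"PID": 109897, "Title": "Prop 1"}\n{"PID": 109890, "Title": "Prop 2"}\n'
print(parse_json_lines(sample))
# [{'PID': 109897, 'Title': 'Prop 1'}, {'PID': 109890, 'Title': 'Prop 2'}]
```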