我想转换:
datalist = [u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg'}",
u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/3/_/3_13.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/3/_/3_13.jpg'}",
u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/5/_/5_3_1.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/_/5_3_1.jpg'}",
u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/1/_/1_22.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_22.jpg'}",
u"{gallery: 'gal1', smallimage: 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/4/_/4_7_1.jpg',largeimage: 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/4/_/4_7_1.jpg'}"]
列出包含python dict。如果我尝试使用关键字提取值我得到了这个错误:
for i in datalist:
print i['smallimage']
....:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-20-686ea4feba66> in <module>()
1 for i in datalist:
----> 2 print i['smallimage']
3
TypeError: string indices must be integers
如何将包含Unicode Dict的列表转换为Dict ..
答案 0 :(得分:8)
您可以使用demjson模块,该模块具有处理您所拥有数据的非严格模式:
import demjson
for data in datalist:
dct = demjson.decode(data)
print dct['gallery'] # etc...
答案 1 :(得分:3)
在这种情况下,我会手工制作一个正则表达式,使它们成为可以用Python评估的东西:
import re
import ast
from functools import partial
keys = re.compile(r'(gallery|smallimage|largeimage)')
fix_keys = partial(keys.sub, r'"\1"')
for entry in datalist:
entry = ast.literal_eval(fix_keys(entry))
是的,这是有限的;但它适用于此集,并且只要密钥匹配就很健壮。正则表达式是 simple 来维护。此外,这不使用任何外部依赖,它都是基于已经包含的电池。
结果:
>>> for entry in datalist:
... print ast.literal_eval(fix_keys(entry))
...
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/3/_/3_13.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/3/_/3_13.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/_/5_3_1.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/5/_/5_3_1.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_22.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/1/_/1_22.jpg'}
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/4/_/4_7_1.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/4/_/4_7_1.jpg'}
答案 2 :(得分:3)
正如另一个想法,你的列表格式正确Yaml。
> yaml.load(u'{foo: "bar"}')['foo']
'bar'
如果你想真正想要并一次解析所有内容:
> data = yaml.load('['+','.join(datalist)+']')
> data[0]['smallimage']
'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'
> data[3]['gallery']
'gal1'
答案 3 :(得分:2)
如果您的字典键被引用,您可以
使用json.loads
加载字符串。
import json
for i in datalist:
print json.loads(i)['smallimage']
(ast.literal_eval
也会奏效......)
然而,实际上,这适用于旧学校eval
:
>>> class Mdict(dict):
... def __missing__(self,key):
... return key
...
>>> eval(datalist[0],Mdict(__builtins__=None))
{'largeimage': 'http://www.styleever.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/_/2_12.jpg', 'gallery': 'gal1', 'smallimage': 'http://www.styleever.com/media/catalog/product/cache/1/small_image/445x370/17f82f742ffe127f42dca9de82fb58b1/2/_/2_12.jpg'}
请注意,这可能容易受到注入攻击,因此只有在字符串来自可靠来源时才使用它。
最后,对于任何想要一个简短的,虽然有点密集的解决方案,只使用标准库并且不易受到注入攻击的人...这个小宝石可以解决这个问题(假设字典键是有效的标识符)!
import ast
class RewriteName(ast.NodeTransformer):
def visit_Name(self,node):
return ast.Str(s=node.id)
transformer = RewriteName()
for x in datalist:
tree = ast.parse(x,mode='eval')
transformer.visit(tree)
print ast.literal_eval(tree)['smallimage']
答案 4 :(得分:0)
您的datalist是list
unicode字符串。
您可以使用eval
,但您的密钥未正确引用。您可以做的是使用replace
动态重新引用您的密钥:
for i in datalist:
my_dict = eval(i.replace("gallery", "'gallery'").replace("smallimage", "'smallimage'").replace("largeimage", "'largeimage'"))
print my_dict["smallimage"]
答案 5 :(得分:-4)
我不明白为什么需要所有额外的东西,例如使用re
或json
......
fdict = {str(k): v for (k, v) in udict.items()}
udict
是具有dict
个密钥的unicode
。只需将它们转换为str
即可。在您的给定数据中,您可以简单地......
datalist = [dict((str(k), v) for (k, v) in i.items()) for i in datalist]
简单测试:
>>> datalist = [{u'a':1,u'b':2},{u'a':1,u'b':2}]
[{u'a': 1, u'b': 2}, {u'a': 1, u'b': 2}]
>>> datalist = [dict((str(k), v) for (k, v) in i.items()) for i in datalist]
>>> datalist
[{'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
否import re
或import json
。简单快捷。