I have a Python dictionary like this:
{'key1' : ['1640', 315], 'key2' : ['1638', 750, 340, '1639', 450, 300]}
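I would like to convert each list into a nested dictionary, where the strings ('1640', '1638', '1639') are the keys and the integers that follow each string are the corresponding values.
I have tried itertools izip, which works, but only for the first element, so I get back
{'key1': {'1640': 315}, 'key2': {'1638': 750}}
whereas the desired output is:
{'key1': {'1640': 315}, 'key2': {'1638': [750, 340], '1639': [450, 300]}}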
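For context, here is a minimal sketch of what the pairing idiom `dict(izip(it, it))` does on the two sample lists. It uses the built-in `zip`, which on Python 3 is lazy like `izip`; the helper name `pair_up` is made up for illustration. The idiom consumes the list two items at a time, so it only fits lists that strictly alternate key, value:

```python
def pair_up(items):
    """Pair consecutive elements: (items[0], items[1]), (items[2], items[3]), ..."""
    it = iter(items)
    # Both arguments are the same iterator, so each pair takes two fresh items.
    return dict(zip(it, it))

regular = pair_up(['1640', 315])
irregular = pair_up(['1638', 750, 340, '1639', 450, 300])
print(regular)    # {'1640': 315}
print(irregular)  # {'1638': 750, 340: '1639', 450: 300} (key order may differ on older Pythons)
```

On the irregular list the second pair is (340, '1639'), so an integer ends up as a key and a string as a value, which is why a fixed step of two cannot produce the desired grouping.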
Edit:
The problem I need to solve is the last step of a parsing project. Here is the code for the first part of the project so far.
import re

with open('file.htm') as f:
    Book_chap = f.read()
# get a list of all the strings between paragraph tags
pattern1 = re.compile(r'<p>(.*?)</p>')
list_of_strings = pattern1.findall(Book_chap)
# regex to get names + surnames, or @ + surnames
names = re.compile(r'(\@.\s?[A-Z][a-z]+|[A-Z][a-z]+(?=\s[A-Z])(?:\s[A-Z][a-z]+)+)')
years_and_cases = re.compile(r'\s\d{1,4}\:.*?\.')
my_dict = {}
#dict2 = {}
for element in list_of_strings[1:]:  # do not consider the first paragraph
    substitute = element.replace('�', '@')  # replace the unrecognized character with '@'
    names_ = names.findall(substitute)
    yc_find = years_and_cases.findall(substitute)
    for k in names_:
        for v in yc_find:
            # remove punctuation from the string, leaving the number tokens
            out = "".join(c for c in v if c not in ('!', '.', ':', ','))
            out = out.split()  # get tokens
            # tokens shorter than 4 characters become ints (values); the rest stay strings (keys)
            out = [int(i) if len(i) < 4 else i for i in out]
            my_dict[k] = out
            #out = iter(out)
            #outt = dict(izip(out, out))
            #dict2[k] = outt  # this returns only the first key in the list plus the first value bound to that key
print(my_dict)
I thought about using the type of the elements in the value list:
d2 = {}
for k, v in my_dict.items():
    d2[k] = {}
    for item in v:
        if type(item) == 'str':
            d2[k][item] = {}  # string items are keys in the nested dict
        else:
            pass  # integer items are values (?)
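Even the first part of this does not work: I get a second dictionary with the same keys as the first one, but each value is an empty nested dictionary, and I do not get the inner keys (which should be the strings). I have looked at some previous questions about dictionary comprehensions, but it seems to me that they rely on a fixed step or regular intervals, whereas my value lists are irregular (I may have one or more keys, and each key may have one or more values).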
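One likely reason the inner keys never appear is that `type(item) == 'str'` compares a type object with the string `'str'`, which is always False; `isinstance(item, str)` is the usual test. Below is a minimal sketch of the grouping under that assumption. The helper name `nest` is made up for illustration, and it assumes every list starts with a string key (on Python 2 with unicode strings, `basestring` would be needed instead of `str`):

```python
def nest(flat):
    """Group each run of integers under the string that precedes it."""
    nested = {}
    current = None
    for item in flat:
        if isinstance(item, str):
            # A string starts a new group and becomes the inner key.
            current = item
            nested.setdefault(current, [])
        elif current is not None:
            # Any other item is a value for the most recent key.
            nested[current].append(item)
    # Collapse one-element lists to a bare value, as in the desired output.
    return {key: vals[0] if len(vals) == 1 else vals
            for key, vals in nested.items()}

my_dict = {'key1': ['1640', 315],
           'key2': ['1638', 750, 340, '1639', 450, 300]}
d2 = {k: nest(v) for k, v in my_dict.items()}
print(d2)
```

Because the loop tracks the current key instead of stepping through the list at a fixed interval, it handles any number of keys per list and any number of values per key. Dropping the final comprehension in `nest` would keep every value as a list, which may be easier to work with downstream.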