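My question is whether my program can change which dictionary it is working on (in this case, adding information to it) on each iteration of a for loop. That is, it fills dict_1 first, then dict_2, and so on.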
def getDicts(aFile):
    voteFile = open(aFile)
    listDicts = [{},{},{},{},{},{},{},{},{},{},{}]
    i = 0
    for line in voteFile:
        lineSplit = line.split(':')
        if len(lineSplit) > 1:
            key = lineSplit[0].strip()
            value = lineSplit[1].strip()
            listDicts[i][key] = value
        else:
            i += 1
    return listDicts
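The file the program processes contains blocks of text in which every line has two terms separated by a colon. Each block is followed by a blank line. That is why there is a while loop checking whether the number of terms on each line is not 2. When the program exits the while loop, I want it to add the completed dictionary (dict_1) to the list of dictionaries, then start again on the next block of text in the .txt file, this time adding the information to dict_2.
The .txt data, as requested: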
_Constituency:East Midlands
_Seats:5
Brexit Party:452321
Liberal Democrats:203989
Labour:164682
Conservative:126138
Green:124630
UKIP:58198
Change UK:41117
Independent Network:7641
Simon Rood (Independent):4511
_Constituency:East of England
_Seats:7
Brexit Party:604715
Liberal Democrats:361563
Green:202460
Conservative:163830
Labour:139490
Change UK:58274
UKIP:54676
English Democrat:10217
Attila Csordas (Independent):3230
_Constituency:London
_Seats:8
Liberal Democrats:608725
Labour:536810
Brexit Party:400257
Green:278957
Conservative:177964
Change UK:117635
UKIP:46497
Animal Welfare:25232
Women's Equality:23766
UK EU:18806
Claudia Mcdowell (Independent):1036
Daze Aghaji (Independent):1018
Roger Hallam (Independent):924
Kofi Klu (Independent):869
Andrea Venzon (Independent):731
Mike Shad (Independent):707
Zoe Lafferty (Independent):436
Andrew Medhurst (Independent):430
Alan Kirkby (Independent):401
Ian Sowden (Independent):254
Henry Muss (Independent):226
_Constituency:North East England
_Seats:3
Brexit Party:240056
Labour:119931
Liberal Democrats:104330
Green:49905
Conservative:42395
UKIP:38269
Change UK:24968
_Constituency:North West England
_Seats:8
Brexit Party:541843
Labour:380193
Liberal Democrats:297507
Green:216581
Conservative:131002
UKIP:62464
Change UK:47237
Tommy Robinson (Independent):38908
English Democrat:10045
UK EU:7125
Mohammad Aslam (Independent):2002
_Constituency:South East England
_Seats:10
Brexit Party:915686
Liberal Democrats:653743
Green:343249
Conservative:260277
Labour:184678
Change UK:105832
UKIP:56487
UK EU:7645
Jason Guy Spencer McMahon (Independent):3650
Socialist (GB):3505
David Victor Round (Independent):2606
Michael Jeffrey Turberville (Independent):1587
_Constituency:South West England
_Seats:6
Brexit Party:611742
Liberal Democrats:385095
Green:302364
Conservative:144674
Labour:108100
UKIP:53739
Change UK:46612
English Democrat:8393
Larch Maxey (Independent):1772
Mothiur Rahman (Independent):755
Neville Seed (Independent):3383
_Constituency:West Midlands
_Seats:7
Brexit Party:507152
Labour:228298
Liberal Democrats:219982
Green:143520
Conservative:135279
UKIP:66934
Change UK:45673
_Constituency:Yorkshire and the Humber
_Seats:6
Brexit Party:470351
Labour:210516
Liberal Democrats:200180
Green:166980
Conservative:92863
UKIP:56100
Yorkshire Party:50842
Change UK:30162
English Democrat:11283
_Constituency:Scotland
_Seats:6
SNP:594553
Brexit Party:233006
Liberal Democrats:218285
Conservative:182476
Labour:146724
Scottish Green:129603
Change UK:30004
UKIP:28418
Gordon Edgar (Independent):6128
Ken Parke (Independent):2049
_Constituency:Wales
_Seats:4
Brexit Party:271404
Plaid Cymru:163928
Labour:127833
Liberal Democrats:113885
Conservative:54587
Green:52660
UKIP:27566
Change UK:24332
Answer 0 (score: -1)
Here is my suggestion:
def getConstituencies(aFile):
    all_data = {}
    constituency_keys = []
    constituency_idx = []
    # Store all of the lines in the text file, without trailing newlines
    # (text mode, so the lines are str and can be split on ':')
    with open(aFile) as f:
        lines = [line.strip() for line in f]
    # Go through the stored lines and find all of the constituencies and their indexes
    for idx, line in enumerate(lines):
        if line.split(':')[0] == '_Constituency':
            constituency_keys.append(line.split(':')[1].strip())
            constituency_idx.append(idx)
    # Now go through the stored lines from each index until a blank line and store the related info in a dictionary.
    # Then nest that dictionary in the all_data dictionary
    for idx, c in zip(constituency_idx, constituency_keys):
        temp_dict = {}
        for line in lines[idx+1:]:
            if line == "":
                break
            temp_dict[line.split(':')[0].strip()] = line.split(':')[1].strip()
        all_data[c] = temp_dict
    return all_data
Accessing the information is then just a matter of looking up the constituency:
all_data['East Midlands']['Green']
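As a hedged illustration of that lookup, the sketch below builds the same kind of nested dictionary from an in-memory sample instead of a file. The function name `parse_blocks` and the shortened sample string are assumptions made for this example, and unlike the code above it converts numeric fields to `int`.

```python
def parse_blocks(text):
    """Return {constituency: {field: value}} from colon-separated blocks."""
    all_data = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ':' not in line:
            continue  # blank (or malformed) lines only separate blocks
        key, value = (part.strip() for part in line.split(':', 1))
        if key == '_Constituency':
            # A header line starts a new nested dictionary
            current = all_data.setdefault(value, {})
        elif current is not None:
            # Numeric fields (seats, vote counts) become ints
            current[key] = int(value) if value.isdigit() else value
    return all_data

# Shortened sample in the same layout as the question's data (assumption)
sample = """_Constituency:East Midlands
_Seats:5
Brexit Party:452321
Green:124630

_Constituency:Wales
_Seats:4
Brexit Party:271404
Green:52660
"""

all_data = parse_blocks(sample)
print(all_data['East Midlands']['Green'])
```

Because a new nested dictionary starts at each `_Constituency` line, this variant does not depend on the blank lines being present.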
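I would also suggest looking at Pandas so that you can start organizing your data into DataFrames; this will make your data handling much easier.
Suggestions and feedback from other users are welcome.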
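Answer 1 (score: -1)
As a computer scientist, I personally like to keep the convention of starting from 0 rather than 1. For example, dict_1 through dict_11 are now dict_0 through dict_10.
I'm not sure why you would use a while loop in your code; I think an if statement is enough.
Anyway, here is some code for you. It only uses a few dictionaries, but it does what you ask (assuming you always have 11 lines of data followed by a blank line).
By the way, % is the modulo operator. It returns the remainder of a division. So 5 % 4 = 1 and 4 % 3 = 1.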
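To make the remainder behaviour concrete, here is a small sketch of how `%` turns an ever-growing counter into an index that cycles through a fixed-size list. The three-element list and the variable names are assumptions for illustration only.

```python
slots = ['dict_0', 'dict_1', 'dict_2']  # stand-ins for three dictionaries

# The index wraps back to 0 each time the counter reaches len(slots)
indices = [count % len(slots) for count in range(7)]
print(indices)  # [0, 1, 2, 0, 1, 2, 0]

# Each counter value therefore selects the slots in round-robin order
chosen = [slots[i] for i in indices]
print(chosen[3])  # dict_0

print(5 % 4, 4 % 3)  # 1 1
```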
test.txt
key1:val1
key2:val2
key3:val3
key4:val4
key5:val5
key6:val6
Script
voteFile = open('test.txt')
dict_0 = {}
dict_1 = {}
dict_2 = {}
listDicts = [dict_0, dict_1, dict_2]
lineCount = 0
for line in voteFile:
    lineSplit = line.split(":")
    if len(lineSplit) == 2:
        # Retrieves the dictionary you need by index
        listDictIndex = lineCount%len(listDicts)
        currentDict = listDicts[listDictIndex]
        key = lineSplit[0].strip()
        value = lineSplit[1].strip()
        currentDict[key] = value
        lineCount += 1
voteFile.close()
import pprint
pprint.pprint(listDicts)
Edit: the algorithm has no nested loops, so the running time is O(n), where n is the number of lines in the input document.
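For comparison, a single pass can also start a fresh dictionary at each blank line, which is the behaviour the question describes, without fixing the number of dictionaries in advance. This is a sketch under stated assumptions rather than either answerer's code: the name `split_into_dicts` is hypothetical, and it accepts any iterable of lines so it can be tried without a file.

```python
def split_into_dicts(lines):
    """Group 'key:value' lines into one dict per blank-line-separated block."""
    dicts = []
    current = {}
    for line in lines:
        parts = line.split(':', 1)
        if len(parts) == 2:
            current[parts[0].strip()] = parts[1].strip()
        elif current:
            # A line without a colon (e.g. a blank line) closes the current block
            dicts.append(current)
            current = {}
    if current:
        # Keep the last block even if the input does not end with a blank line
        dicts.append(current)
    return dicts

# Hypothetical input: three blocks, with a doubled blank line between the last two
lines = ["key1:val1\n", "key2:val2\n", "\n", "key3:val3\n", "\n", "\n", "key4:val4\n"]
result = split_into_dicts(lines)
print(result)
```

Since an open file object is itself an iterable of lines, something like `with open('test.txt') as f: split_into_dicts(f)` should work the same way. It is still one loop over the input, so the O(n) running time is unchanged.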