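Using: Python 2.4
Currently, I have a nested for loop that iterates over two lists and matches rows on two elements that exist in both lists. Once a match is found, it takes the element from the r120Final list and puts it into a new list called "r120Delta":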
for r120item in r120Final:
    for spectraItem in spectraFinal:
        if(str(spectraItem[0]) == r120item[2].strip()) and (str(spectraItem[25]) == r120item[10]):
            r120Delta.append(r120item)
            break
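The problem is that this is very slow, and the lists are not even that deep. R120 is about 64,000 rows and Spectra is about 150,000 rows.
The r120Final list is a nested array, and it looks like this: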
r120Final[0] = [['xxx','xxx','12345','xxx','xxx','xxx','xxx','xxx','xxx','xxx','234567']]
...
r120Final[n] = [['xxx','xxx','99999','xxx','xxx','xxx','xxx','xxx','xxx','xxx','678901']]
The spectraFinal list is essentially the same, a nested array, and it looks like this:
spectraFinal[0] = [['12345','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','234567']]
...
spectraFinal[n] = [['99999','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','678901']]
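Finally, the reason for "r120Delta" is so that I can do a list difference between r120Final and r120Delta and retrieve the r120 data elements that were NOT matched. This is the function I defined for that task, and again, it is slow: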
def listDiff(diffList, completeList):
    returnList = []
    for completeItem in completeList:
        if completeItem not in diffList:
            returnList.append(completeItem)
    return returnList
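Basically, I am fairly knowledgeable with Python, but by no means an expert. I am looking for some experts to show me how to speed this up. Any help is appreciated!
Answer 0: (score: 2)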
spectra_set = set((str(spectraItem[0]), str(spectraItem[25])) for spectraItem in spectraFinal)
returnList = []
for r120item in r120Final:
    if (r120item[2].strip(), r120item[10]) not in spectra_set:
        returnList.append(r120item)
This adds every r120 item that has no match in spectraFinal to returnList.
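A minimal runnable sketch of the same set lookup on made-up sample rows, in case it helps to see it end to end. The helper functions and the sample values are assumptions for illustration; only columns 2 and 10 of R120 and columns 0 and 25 of Spectra matter. It fills both the matched list (r120Delta) and the unmatched list in one pass, so a separate list-difference step should not be needed:

```python
def make_r120_row(col2, col10):
    # Hypothetical helper: an 11-column R120 row with filler values.
    row = ['xxx'] * 11
    row[2] = col2
    row[10] = col10
    return row

def make_spectra_row(col0, col25):
    # Hypothetical helper: a 26-column Spectra row with filler values.
    row = ['xxx'] * 26
    row[0] = col0
    row[25] = col25
    return row

r120Final = [
    make_r120_row(' 12345 ', '234567'),   # padded, matches after strip()
    make_r120_row('99999', '678901'),
    make_r120_row('55555', '000000'),     # has no Spectra counterpart
]
spectraFinal = [
    make_spectra_row(12345, 234567),      # non-string values go through str()
    make_spectra_row('99999', '678901'),
]

# Build the set of (column 0, column 25) keys once; each lookup is then
# roughly constant time instead of a scan over every Spectra row.
spectra_set = set([(str(s[0]), str(s[25])) for s in spectraFinal])

r120Delta = []       # rows with a Spectra match
r120Unmatched = []   # rows without one
for r120item in r120Final:
    if (r120item[2].strip(), r120item[10]) in spectra_set:
        r120Delta.append(r120item)
    else:
        r120Unmatched.append(r120item)
```

With 64,000 and 150,000 rows this is one pass over each list, rather than up to 64,000 x 150,000 comparisons in the nested loop.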
You can do this in one line (if you really want):
returnList = [r120item for r120item in r120Final
              if (r120item[2].strip(), r120item[10]) not in
              set((str(spectraItem[0]), str(spectraItem[25]))
                  for spectraItem in spectraFinal)]
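One caveat: the condition of a list comprehension is evaluated for every element, so the set(...) call in the one-liner runs once per r120item and most of the speed-up is lost. A sketch that keeps the comprehension but builds the set once follows; the function name unmatched_rows and the sample rows are assumptions for illustration:

```python
def unmatched_rows(r120_rows, spectra_rows):
    # Build the lookup set once, outside the comprehension.
    spectra_keys = set([(str(s[0]), str(s[25])) for s in spectra_rows])
    return [r for r in r120_rows
            if (r[2].strip(), r[10]) not in spectra_keys]

# Made-up sample rows: 11 columns for R120, 26 for Spectra.
r120_rows = [
    ['xxx', 'xxx', '12345 '] + ['xxx'] * 7 + ['234567'],   # has a match
    ['xxx', 'xxx', '77777'] + ['xxx'] * 7 + ['111111'],    # no match
]
spectra_rows = [['12345'] + ['xxx'] * 24 + ['234567']]

result = unmatched_rows(r120_rows, spectra_rows)
```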
If you need the full spectraItem:
spectra_dict = dict(((str(spectraItem[0]), str(spectraItem[25])), spectraItem) for spectraItem in spectraFinal)
returnList = []
for r120item in r120Final:
    key = (r120item[2].strip(), r120item[10])
    if key not in spectra_dict:
        returnList.append(r120item)
    else:
        return_item = some_function_of(r120item, spectra_dict[key])
        returnList.append(return_item)
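To make the dictionary variant concrete, here is a small sketch with a stand-in for some_function_of. The combine function, the sample rows, and the choice to keep matched and unmatched rows in separate lists are all assumptions for illustration, not part of the original data:

```python
def combine(r120item, spectraItem):
    # Hypothetical merge: the R120 row followed by Spectra column 1.
    return r120item + [spectraItem[1]]

# Made-up sample rows: 11 columns for R120, 26 for Spectra.
r120Final = [
    ['xxx', 'xxx', '12345'] + ['xxx'] * 7 + ['234567'],
    ['xxx', 'xxx', '77777'] + ['xxx'] * 7 + ['111111'],
]
spectraFinal = [['12345', 'spectra-data'] + ['xxx'] * 23 + ['234567']]

# Map each (column 0, column 25) key to its full Spectra row.
spectra_dict = dict([((str(s[0]), str(s[25])), s) for s in spectraFinal])

matched = []     # combined rows for R120 items that have a Spectra row
unmatched = []   # R120 items with no Spectra row
for r120item in r120Final:
    key = (r120item[2].strip(), r120item[10])
    if key in spectra_dict:
        matched.append(combine(r120item, spectra_dict[key]))
    else:
        unmatched.append(r120item)
```

Note that if several Spectra rows share the same key, the dictionary keeps only the last one; if all of them are needed, each key would have to map to a list of rows instead.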