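I am filtering a very large list of dictionaries. kept is the global list; it contains about 9000 dictionaries, all with the same keys. I am trying to remove every dictionary whose 'M_P' value is greater than -4.5. More than half of them meet that condition, so I wrote a function just for this purpose. When I check in a later function whether they have been removed, about 3000 are still left. Can anyone tell me why this happens, and can I trust these functions to do what I tell them to?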
def removeMag():
    countMag = 0
    for mag in kept:
        if to_float(mag['M_P']) > -4.5:
            kept.remove(mag)
            countMag += 1
        else:
            continue
    print '\n'
    print ' Number of mags > -4.5 actually removed: '
    print countMag
def remove_anomalies():
    count = 0
    count08 = 0
    count09 = 0
    count01 = 0
    countMag = 0
    countMagDim = 0
    #Want to remove Q* < 15 degrees
    for row in kept:
        #to_float(kept(row))
        #Q* greater than 15
        if to_float(row['Q*']) < 15.00:
            kept.remove(row)
        elif to_float(row['vel']) > 80.00:
            kept.remove(row)
        elif to_float(row['err']) >= 0.5*to_float(row['vel']):
            kept.remove(row)
        elif row['log_10_m'] == '?':
            kept.remove(row)
            #print row
            count+=1
        elif row['M_P'] == '?':
            kept.remove(row)
            countMag += 1
        elif to_float(row['M_P']) > -4.5:
            countMagDim += 1
Right here is where I am checking it. ^^^
        elif to_float(row['T_j']) < -50.00 or to_float(row['T_j'] > 50.00):
            kept.remove(row)
            count01 += 1
        #make sure beg height is above end height.
        elif to_float(row['H_beg']) < to_float(row['H_end']):
            kept.remove(row)
        #make sure zenith distance is not greater than 90
        elif to_float(row['eta_p']) > 90.00:
            kept.remove(row)
        #Remove extremities hyperbolic orbits
        elif (to_float(row['e']) > 2.00 and to_float(row['e']) == 0.00 and to_float(row['a']) == 0.00 and to_float(row['incl']) == 0.00 and to_float(row['omega']) == 0.00 and to_float(row['anode']) == 0.00 and to_float(row['alp_g']) == 0.00 and to_float(row['del_g']) == 0.00 and to_float(row['lam_g']) == 0.00 and to_float(row['bet_g']) == 0.00):
            kept.remove(row)
            count08+=1
        elif to_float(row['q_per']) == 0.00:
            kept.remove(row)
            count09+=1
        elif to_float(row['q_aph']) == 0.00:
            kept.remove(row)
            count09+=1
        else: continue
    print 'Number of dicts with ? as mass value:'
    print count
    print " Number removed with orbital elements condition: "
    print count08
    print "Number of per or aph equal to 0: "
    print count09
    print "Number of T_j anomalies: "
    print count01
    print "Number of Magnitudes removed from '?': "
    print countMag
The following prints something like 3000.
    print "Number of Magnitudes to be removed from too dim: "
    print countMagDim
    '''
    print "\n"
    print "log mass values:"
    for row2 in kept:
        print row2['log_10_mass']
    print "\n"
    '''
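Answer 0 (score: 5)
When you iterate with a for loop, Python does not make a copy of the list; it iterates over the list itself. So when you remove an element during the loop, the loop does not account for the change and skips elements of the list.
Example: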
>>> l = [1,2,3,4,5]
>>> for i in l: l.remove(i)
>>> l
[2, 4]
You can use slice notation as a shorthand to copy the list before iterating over it, for example:
>>> for i in l[:]: l.remove(i)
>>> l
[]
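To make the skipping visible, here is a small sketch that records which elements the loop actually visits, and then applies the slice-copy fix to data shaped like the question's. The `to_float` below is a hypothetical stand-in for the asker's helper (its real definition is not shown), and the sample rows are invented for illustration.

```python
# Part 1: record which values the loop sees while the list shrinks.
items = [1, 2, 3, 4, 5]
visited = []
for value in items:
    visited.append(value)
    items.remove(value)  # later elements shift one slot to the left

print(visited)  # [1, 3, 5]  (2 and 4 were never visited)
print(items)    # [2, 4]


# Part 2: the same removal done safely by looping over a copy.
def to_float(text):
    """Hypothetical stand-in for the asker's to_float helper."""
    return float(text)


# Invented sample rows; only the 'M_P' key matters here.
kept = [{'M_P': '-3.0'}, {'M_P': '-2.0'}, {'M_P': '-5.0'}, {'M_P': '-1.0'}]
removed = 0
for mag in kept[:]:  # kept[:] is a shallow copy, so kept can change freely
    if to_float(mag['M_P']) > -4.5:
        kept.remove(mag)
        removed += 1

print(removed)  # 3
print(kept)     # [{'M_P': '-5.0'}]
```

Note that `list.remove` searches the list from the start on every call, so on about 9000 rows building a new list is likely to be faster than repeated removal.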
Answer 1 (score: 2)
As others have said, you are modifying a list while iterating over it.
The simple one-liner would be
kept = [mag for mag in kept if to_float(mag['M_P']) <= -4.5]
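which keeps only the entries you are interested in and replaces the original list.
To count how many were removed, take len(kept) before and after the comprehension and subtract.
Alternatively,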
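The length-difference count could be sketched as follows. The `to_float` helper and the sample rows are assumptions made for illustration; the asker's real helper and data are not shown.

```python
def to_float(text):
    """Hypothetical stand-in for the asker's to_float helper."""
    return float(text)


# Invented sample rows; only the 'M_P' key matters here.
kept = [{'M_P': '-3.0'}, {'M_P': '-5.0'}, {'M_P': '-4.5'}, {'M_P': '0.2'}]

before = len(kept)
kept = [mag for mag in kept if to_float(mag['M_P']) <= -4.5]
removed = before - len(kept)

print(removed)  # 2
print(kept)     # [{'M_P': '-5.0'}, {'M_P': '-4.5'}]
```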
discarded = [mag for mag in kept if to_float(mag['M_P']) > -4.5]
kept = [mag for mag in kept if to_float(mag['M_P']) <= -4.5]
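which splits the list without losing any information.
Answer 2 (score: 1)
You should never modify a sequence that you are iterating over in a for loop. Just look at your first function: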
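The two comprehensions walk the list twice. If that matters, the same split can be done in a single pass, as in this sketch. The `to_float` helper, the sample rows, and the name `kept_rows` are assumptions for illustration only.

```python
def to_float(text):
    """Hypothetical stand-in for the asker's to_float helper."""
    return float(text)


# Invented sample rows; only the 'M_P' key matters here.
kept = [{'M_P': '-3.0'}, {'M_P': '-5.0'}, {'M_P': '-4.5'}, {'M_P': '0.2'}]

kept_rows = []   # rows with M_P <= -4.5
discarded = []   # rows with M_P > -4.5
for mag in kept:
    if to_float(mag['M_P']) > -4.5:
        discarded.append(mag)
    else:
        kept_rows.append(mag)

kept = kept_rows
print(len(discarded))  # 2
print(len(kept))       # 2
```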
def removeMag():
    countMag = 0
    for mag in kept:
        if to_float(mag['M_P']) > -4.5:
            kept.remove(mag)
            countMag += 1
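You are calling remove on kept inside the loop. This makes the loop hard to reason about: as the list shrinks underneath it, elements get skipped. See this question.
A simple way to fix this is to build a new list of the items you want to keep: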
mag_to_keep = []
for mag in kept:
    if float(mag['M_P']) <= -4.5:
        mag_to_keep.append(mag)
kept = mag_to_keep
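One caveat when this goes back inside a function such as removeMag: a plain `kept = ...` assignment there creates a local name unless `kept` is declared `global`. A possible way around that, sketched below, is slice assignment, which replaces the contents of the existing list object. The function name `remove_mag`, its parameters, the `to_float` stand-in, and the sample rows are all assumptions for illustration.

```python
def to_float(text):
    """Hypothetical stand-in for the asker's to_float helper."""
    return float(text)


def remove_mag(rows, limit=-4.5):
    """Drop rows whose M_P is above limit, in place; return the number dropped."""
    before = len(rows)
    # Slice assignment mutates the same list object instead of rebinding a name,
    # so every other reference to the list sees the filtered contents.
    rows[:] = [row for row in rows if to_float(row['M_P']) <= limit]
    return before - len(rows)


# Invented sample rows; only the 'M_P' key matters here.
kept = [{'M_P': '-3.0'}, {'M_P': '-5.0'}, {'M_P': '-4.5'}, {'M_P': '0.2'}]
alias = kept  # a second reference to the same list, as another function would hold

count = remove_mag(kept)
print(count)          # 2
print(alias is kept)  # True
print(alias)          # [{'M_P': '-5.0'}, {'M_P': '-4.5'}]
```

Passing the list in as an argument also makes the function easier to test than one that reaches for a global.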