In Python, I have a list of triples of the following form:
[(220, 1.0, 1), (385, 1.0, 2), (405, 1.0, 2), (1276, 1.0, 6), (1649, 1.0, 8), (1941, 1.0, 10), (2554, 1.0, 13), (3123, 1.0, 16), (2377, 0.8879465659, 12), (500, 0.8854919047, 2), (2435, 0.8815715038, 12), (2151, 0.8787807797, 11), (1888, 0.87827976, 9), (2185, 0.8780501222, 11), (2215, 0.8747450062, 11), (358, 0.8724861947, 2), (3636, 0.8716343914, 19), (734, 0.8714647102, 3), (1742, 0.8707242976, 9), …………]
I want to split the elements of the list into ten ranges ("tracks") according to the second value of each triple:
>=0 and <0.1
>=0.1 and <0.2
>=0.2 and <0.3
>=0.3 and <0.4
>=0.4 and <0.5
.....
>=0.8 and <0.9
>=0.9 and <=1.0
I tried using:
tracks = np.linspace(0,1,11) # array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
for i in range (len(tracks)-1):
    print( tracks[i], "a", tracks[i+1], ":", sum([1 for (x,y,z) in sm_list if y >= tracks[i]] and y < intervalos[i+1]))
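but it raises the following error: "NameError: name 'y' is not defined"
How do I write a compound boolean condition that produces the counts for the ranges above?
Doing: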
tracks = np.linspace(0,1,11) # array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
for i in range (len(tracks)-1):
    print( tracks[i], "a", tracks[i+1], ":", sum([1 for (x,y,z) in sm_list if y >= tracks[i]] ))
does not raise an error, but it returns overlapping ranges, which is not what I need.
Answer 0: (score: 0)
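The second condition has to go inside the list comprehension, before its closing bracket; outside the comprehension, y is not defined:
import numpy as np
tracks = np.linspace(0,1,11) # array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
for i in range (len(tracks)-1):
    print( tracks[i], "a", tracks[i+1], ":", sum([1 for (x,y,z) in sm_list if y >= tracks[i] and y < tracks[i+1]]))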
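The same compound condition can also be written as a chained comparison, which may read more naturally. The following is only a minimal sketch: the helper name `count_in_range` is hypothetical, and `sm_list` here holds just a few triples copied from the question, since the full list is elided there.

```python
# Sample data: a few triples copied from the question (the full list is elided there).
sm_list = [(220, 1.0, 1), (385, 1.0, 2), (2377, 0.8879465659, 12),
           (500, 0.8854919047, 2), (358, 0.8724861947, 2)]

def count_in_range(triples, low, high):
    """Count the triples whose second value y satisfies low <= y < high.

    Hypothetical helper for illustration; `low <= y < high` is equivalent
    to `y >= low and y < high`.
    """
    return sum(1 for (_, y, _) in triples if low <= y < high)

print(count_in_range(sm_list, 0.8, 0.9))  # 3
print(count_in_range(sm_list, 0.9, 1.0))  # 0: the strict upper bound excludes the two 1.0 values
```

Because the upper bound is strict, values equal to 1.0 are not counted in the last range.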
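Note that tracks[-1] equals 1.0, so y < 1.0 will not capture the values that are == 1. Quick and dirty fix: set tracks[-1] = 1.01 before entering the loop.
If you don't mind dropping the for loop and prefer a numpy approach, with a bit of pandas (I encourage you to run this on your data line by line and look at the output; you may find that you only need the first few lines, depending on your real purpose):
import numpy as np
import pandas as pd
numbers = [y for (x,y,z) in sm_list] # we care only about the second number in each triple anyway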
x = (np.array(numbers) * 10).astype(int) # multiply by 10 and "floor" into ints in the range 0,1,2,...,10 (after scaling, 0-0.9999 maps to 0, 1.0-1.9999 to 1, ..., 9.0-9.9999 to 9, and an original value of 1.0 maps to 10)
# Now we have a list of integers between 0 and 10, and we remain with the task of getting their histogram or counts.
# There are plenty of ways to solve this, e.g. using collections.Counter(x), but we'll use a fancier way:
s = pd.Series(x).value_counts() # may already suffice for your needs
s = s.reindex(range(0,11), fill_value=0) # ensures that all values 0,1,2,...,10 have a count associated with them (namely 0) even if they did not occur in the input
if False: # bonus: if you don't like the cell s[10] and want its value to be added into s[9], then change this 'if False' to 'if True'...
    s[9] += s[10]
    s.drop(10, inplace=True)
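As an alternative to flooring into integers, `np.histogram` may fit the question's ranges directly: with explicit bin edges, every bin is half-open except the last one, which also includes its right edge. That matches ">=0.9 and <=1.0" without the 1.01 workaround or the extra bucket for 10. This is a sketch on a few triples copied from the question, not on the full (elided) list.

```python
import numpy as np

# Sample data: a few triples copied from the question (the full list is elided there).
sm_list = [(220, 1.0, 1), (385, 1.0, 2), (2377, 0.8879465659, 12),
           (500, 0.8854919047, 2), (358, 0.8724861947, 2)]

values = [y for (_, y, _) in sm_list]   # only the second element of each triple matters
edges = np.linspace(0, 1, 11)           # 11 edges define 10 bins

# All bins are [low, high) except the last, which is the closed interval [0.9, 1.0].
counts, _ = np.histogram(values, bins=edges)

for low, high, n in zip(edges[:-1], edges[1:], counts):
    print(f"{low:.1f} to {high:.1f}: {n}")
```

One caveat: the edges produced by `np.linspace` are floating-point values, so a data value that sits exactly on an interior edge such as 0.3 may land in the neighbouring bin.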