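I'm working with a list of unknown length, and I need to split it into four parts:
First part = the first 20% of the list
Second part = from 20% to 40% of the list
Third part = from 40% to 80% of the list
Fourth part = from 80% to 100% of the list
The problem is that if the list has fewer than 10 elements, some of the parts end up empty. How can I avoid that?
Here is my current script: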
    x = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"]
    twentyPercentOne = len(x) * 0.2
    twentyPercentTwo = len(x) * 0.4
    fourtyPercentThree = len(x) * 0.8
    i = 0
    j = 2
    m = []
    while j < (twentyPercentOne + 1):
        m.append(x[i:j])
        i = (i + 2)
        j = (j + 2)
    h = []
    while j < (twentyPercentTwo + 1):
        h.append(x[i:j])
        i = (i + 2)
        j = (j + 2)
    l = []
    while j < (fourtyPercentThree + 1):
        l.append(x[i:j])
        i = (i + 2)
        j = (j + 2)
    t = x[i:len(x)]
Output:
[['one', 'two']]
[['three', 'four']]
[['five', 'six'], ['seven', 'eight']]
['nine', 'ten']
Output when the list has fewer than 10 elements, e.g. x = ["one", "two", "three", "four", "five", "six", "seven"]:
[['one', 'two']]
[]
[['three', 'four'], ['five', 'six']]
['seven']
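Does anyone know how to do this? I realize it is more of a math problem than a Python problem, but I can't figure it out and have been working on it for days. Any help would be greatly appreciated.
Thanks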
Answer 0 (score: 7)
This should work for any number of splits of any size (not just four), as long as the fractions add up to 1:
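    import math

    def percentage_split(seq, percentages):
        # Compare with a tolerance: float sums such as sum([0.1] * 10) are not exactly 1.0.
        assert math.isclose(sum(percentages), 1.0)
        prv = 0
        size = len(seq)
        cum_percentage = 0
        for p in percentages:
            cum_percentage += p
            # round(..., 9) absorbs float noise before int() truncates to an index
            nxt = int(round(cum_percentage * size, 9))
            yield seq[prv:nxt]
            prv = nxt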
(This is a generator function, so you can get a list of the quartiles like this:
    list(percentage_split(x, [0.25]*4))
)
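Applied to the 20% / 20% / 40% / 20% split from the question, the generator approach can be sketched as follows. This is a self-contained Python 3 restatement for illustration only: the name `split_by_cumulative` and the sample lists are assumptions, not part of the original answer. Each boundary is truncated to an integer index.

```python
import math

def split_by_cumulative(seq, fractions):
    """Illustrative helper (hypothetical name): yield consecutive slices of seq.

    fractions are the relative sizes of the parts and should sum to 1.
    """
    if not math.isclose(sum(fractions), 1.0):
        raise ValueError("fractions must sum to 1")
    size = len(seq)
    start = 0
    cumulative = 0.0
    for fraction in fractions:
        cumulative += fraction
        # round(..., 9) absorbs float noise; int() then truncates to an index
        stop = int(round(cumulative * size, 9))
        yield seq[start:stop]
        start = stop

ten = ["one", "two", "three", "four", "five",
       "six", "seven", "eight", "nine", "ten"]
seven = ten[:7]
question_split = [0.2, 0.2, 0.4, 0.2]  # first 20%, next 20%, next 40%, last 20%

parts_ten = list(split_by_cumulative(ten, question_split))
parts_seven = list(split_by_cumulative(seven, question_split))

print([len(p) for p in parts_ten])    # [2, 2, 4, 2]
print([len(p) for p in parts_seven])  # [1, 1, 3, 2]
```

With seven elements no part comes out empty here, although the part sizes only approximate the requested percentages.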
If you have numpy installed, it can be a bit more concise:
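    from numpy import cumsum, isclose

    def percentage_split(seq, percentages):
        cdf = cumsum(percentages)
        assert isclose(cdf[-1], 1.0)  # tolerate floating-point rounding in the sum
        # Build a real list: in Python 3, map() returns an iterator,
        # which cannot be concatenated with [0] or iterated twice.
        stops = [int(s) for s in (cdf * len(seq)).round(9)]
        return [seq[a:b] for a, b in zip([0] + stops, stops)]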
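If you just want four equal quartiles:
    import numpy
    numpy.split(seq, 4)  # raises ValueError unless len(seq) is divisible by 4; numpy.array_split(seq, 4) allows uneven parts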
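If numpy is not available, splitting a list into n consecutive, nearly equal parts can be sketched in plain Python. The helper name `equal_parts` is an assumption made for illustration; in this sketch the first `len(seq) % n` parts each get one extra element.

```python
def equal_parts(seq, n):
    """Illustrative helper (hypothetical name): split seq into n consecutive
    parts whose lengths differ by at most one."""
    base, extra = divmod(len(seq), n)
    parts = []
    start = 0
    for k in range(n):
        # the first `extra` parts take one additional element
        stop = start + base + (1 if k < extra else 0)
        parts.append(seq[start:stop])
        start = stop
    return parts

seven = ["one", "two", "three", "four", "five", "six", "seven"]
quarters = equal_parts(seven, 4)
print(quarters)
# [['one', 'two'], ['three', 'four'], ['five', 'six'], ['seven']]
```

Parts can still be empty when the list has fewer than n elements.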
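Answer 1 (score: 0)
You should be aware that it is not always possible to split the list into parts whose lengths match the percentages exactly. But here is another way: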
    def do_split(x, percent):
        # percent holds cumulative fractions; the final entry is implied by the end of the list
        L = len(x)
        idx1 = [0] + [int(L * p) for p in percent[:-1]]
        idx2 = idx1[1:] + [L]
        return [x[i1:i2] for i1, i2 in zip(idx1, idx2)]

    splits = [0.2, 0.4, 0.8, 1.0]
    print(do_split(["one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"], splits))
    # --> [['one', 'two'], ['three', 'four'], ['five', 'six', 'seven', 'eight'], ['nine', 'ten']]
    print(do_split(["one", "two", "three", "four", "five", "six", "seven"], splits))
    # --> [['one'], ['two'], ['three', 'four', 'five'], ['six', 'seven']]
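For very short lists, truncation alone can still produce empty parts. One possible way to address the original question is to clamp each boundary so that every part keeps at least one element. The sketch below is only that, under stated assumptions: the name `split_nonempty` is hypothetical, the fractions are cumulative with a final entry of 1.0, and the list must have at least as many elements as there are parts.

```python
def split_nonempty(seq, cum_fractions):
    """Illustrative helper (hypothetical name): split seq at cumulative
    fractions, forcing every part to contain at least one element."""
    size = len(seq)
    n = len(cum_fractions)
    if size < n:
        raise ValueError("need at least one element per part")
    stops = []
    prev = 0
    for k, frac in enumerate(cum_fractions):
        remaining = n - k - 1               # parts that still come after this one
        stop = int(size * frac)             # truncated boundary
        stop = max(stop, prev + 1)          # keep at least one element in this part
        stop = min(stop, size - remaining)  # leave one element for each later part
        stops.append(stop)
        prev = stop
    return [seq[a:b] for a, b in zip([0] + stops[:-1], stops)]

splits = [0.2, 0.4, 0.8, 1.0]
print(split_nonempty(["one", "two", "three", "four"], splits))
# [['one'], ['two'], ['three'], ['four']]
print(split_nonempty(["one", "two", "three", "four", "five"], splits))
# [['one'], ['two'], ['three', 'four'], ['five']]
```

The trade-off is that for short lists the parts drift further from the requested percentages; with fewer elements than parts, an empty part cannot be avoided, so the sketch raises an error instead.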