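I have two lists of equal length: one is a data series, the other a time series. They represent simulated values measured over time.
I want to create a function that randomly removes a set percentage or fraction of the items from both lists. For example, if my fraction is 0.2, I want to randomly remove 20% of the items from both lists, but the same items (the same indices in each list) must be removed from both.
For example, let n = 0.2 (remove 20%):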
a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
After randomly removing 20%, they become:
a_new = [0,1,3,4,5,6,8,9]
b_new = [0,1,9,16,25,36,64,81]
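The real relationship is not as simple as in this example, so I can't just do this on one list and then compute the second; they already exist as two lists. They must stay in their original order.
Thanks!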
Answer 0 (score: 7)
import random
a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
frac = 0.2 # how much of a/b do you want to exclude
# generate a list of indices to exclude. Turn it into a set for O(1) lookup time
inds = set(random.sample(range(len(a)), int(frac*len(a))))
# use `enumerate` to get list indices as well as elements.
# Filter by index, but take only the elements
new_a = [n for i,n in enumerate(a) if i not in inds]
new_b = [n for i,n in enumerate(b) if i not in inds]
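Since the question asks for a function, the same idea can be wrapped up as one. This is only a minimal sketch: the name `drop_fraction` and the optional `rng` parameter are assumptions made for illustration, not part of the original answer.

```python
import random

def drop_fraction(a, b, frac, rng=random):
    """Return copies of a and b with the same randomly chosen indices removed.

    `rng` is a hypothetical hook: pass random.Random(seed) for reproducible runs.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    n_drop = int(frac * len(a))  # number of items to remove, rounded down
    drop = set(rng.sample(range(len(a)), n_drop))  # set gives O(1) membership tests
    new_a = [x for i, x in enumerate(a) if i not in drop]
    new_b = [x for i, x in enumerate(b) if i not in drop]
    return new_a, new_b

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
new_a, new_b = drop_fraction(a, b, 0.2)
```

Because the list comprehensions walk the inputs from left to right, the surviving items keep their original order, and both outputs lose exactly the same positions.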
Answer 1 (score: 1)
import random
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
percentage = 0.3  # fraction of items to remove
n_keep = len(a) - int(len(a) * percentage)
# sample distinct indices (no repeats) and sort them so the original order is kept
keep = sorted(random.sample(range(len(a)), n_keep))
c, d = [], []
for i in keep:
    c.append(a[i])
    d.append(b[i])
a, b = c, d
print(a)
print(b)
Answer 2 (score: 0)
If a and b are not very large, you can use zip:
import random
a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
frac = 0.2 # how much of a/b do you want to exclude
ab = list(zip(a,b)) # a list of tuples where the first element is from `a` and the second is from `b`
new_ab = random.sample(ab, int(len(a)*(1-frac))) # sample those tuples
new_a, new_b = zip(*new_ab) # unzip the tuples to get `a` and `b` back
Note that this does not preserve the original order of a and b.
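If the order matters, one possible adjustment is to carry each pair's original position through the sampling step and sort by it afterwards. This is a sketch rather than part of the original answer; the names `indexed`, `kept` and `n_keep` are introduced here for illustration, and it assumes at least one item is kept.

```python
import random

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
frac = 0.2  # fraction of items to remove

n_keep = len(a) - int(frac * len(a))
# Attach the original position to every (a, b) pair: (index, (a_item, b_item)).
indexed = list(enumerate(zip(a, b)))
# Sample without replacement, then restore the original order by index.
kept = sorted(random.sample(indexed, n_keep), key=lambda item: item[0])
# Unzip the surviving pairs back into two lists (assumes n_keep >= 1).
new_a, new_b = (list(t) for t in zip(*(pair for _, pair in kept)))
```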
Answer 3 (score: 0)
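You can also work with the zipped a and b sequences: take a random sample of the indices (to maintain the original order of the items), then unzip the result into a_new and b_new again: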
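import random
a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
frac = 0.2  # fraction of items to remove
c = list(zip(a, b))  # list() is needed on Python 3, where zip returns an iterator
n_keep = len(c) - int(frac * len(c))  # random.sample needs an integer count
# sample the indices to keep, sorted so the original order is maintained
indices = sorted(random.sample(range(len(c)), n_keep))
a_new, b_new = zip(*(c[i] for i in indices))
print(a_new)
print(b_new)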
Answer 5 (score: 0)
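import random
a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
n = 0.2  # fraction of items to remove
l = len(a)
n_drop = int(l * n)
n_keep = l - n_drop
# mask with 1 for items to keep and 0 for items to drop, shuffled in place
ind = [1] * n_keep + [0] * n_drop
random.shuffle(ind)
new_a = [e for e, i in zip(a, ind) if i]
new_b = [e for e, i in zip(b, ind) if i]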
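The same shuffled 0/1 mask can also be applied with `itertools.compress` from the standard library, which keeps the items whose mask entry is truthy. A minimal sketch, assuming a, b and n as in the question (the name `mask` is used here in place of `ind`):

```python
import random
from itertools import compress

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
n = 0.2  # fraction of items to remove

n_drop = int(len(a) * n)
# 1 marks an item to keep, 0 an item to drop; shuffle to randomise which go.
mask = [1] * (len(a) - n_drop) + [0] * n_drop
random.shuffle(mask)
# compress yields only the items whose mask entry is truthy, in original order.
new_a = list(compress(a, mask))
new_b = list(compress(b, mask))
```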