Question

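I have two lists of equal length: one is a data series and the other is a time series. They represent simulated values measured over time.

I want to write a function that randomly removes a set percentage or fraction of the items from both lists. That is, if my fraction is 0.2, I want to randomly remove 20% of the items from both lists, but the same items (the same indices in each list) must be removed from both.

For example, with n = 0.2 (20% to be removed):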

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

After randomly removing 20%, they become:

a_new = [0,1,3,4,5,6,8,9]
b_new = [0,1,9,16,25,36,64,81]

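The relationship isn't as simple as in the example, so I can't just do this on one list and then compute the second; they already exist as two lists. They must remain in their original order.

Thanks!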

Answer 1

import random

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

frac = 0.2  # how much of a/b do you want to exclude

# generate the indices to exclude. Turn them into a set for O(1) lookup time
inds = set(random.sample(range(len(a)), int(frac*len(a))))

# use `enumerate` to get list indices as well as elements. 
# Filter by index, but take only the elements
new_a = [n for i,n in enumerate(a) if i not in inds]
new_b = [n for i,n in enumerate(b) if i not in inds]
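
Since the question asks for a function, the snippet above could be wrapped up roughly as follows. This is only a sketch: the name `drop_fraction` and the optional `rng` parameter are assumptions made for illustration, not part of the original answer. Passing a seeded `random.Random` makes the result repeatable.

```python
import random


def drop_fraction(a, b, frac, rng=random):
    """Return copies of a and b with the same random positions removed.

    frac is the fraction of items to remove; the count is rounded down.
    rng is anything with a sample() method (hypothetical parameter).
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    if not 0 <= frac <= 1:
        raise ValueError("frac must be between 0 and 1")
    n_drop = int(frac * len(a))
    # Indices to remove, stored in a set for fast membership tests
    drop = set(rng.sample(range(len(a)), n_drop))
    new_a = [x for i, x in enumerate(a) if i not in drop]
    new_b = [y for i, y in enumerate(b) if i not in drop]
    return new_a, new_b


a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

new_a, new_b = drop_fraction(a, b, 0.2, rng=random.Random(0))
print(new_a)
print(new_b)
```

Because the comprehensions walk the lists in order, the surviving items keep their original order, and both lists lose the same indices.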

Answer 2

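import random

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

percentage = 0.3  # fraction of the items to remove

# Pick the indices to keep without replacement, then sort them so the
# original order is preserved. Drawing each index with randint could
# pick the same index twice and would return the items in random order.
g = sorted(random.sample(range(len(a)), int(len(a) * (1-percentage))))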

c, d = [], []
for i in g:
    c.append(a[i])
    d.append(b[i])

a, b = c, d

print(a)
print(b)

Answer 3

If a and b are not very large, you can use zip:

import random

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

frac = 0.2  # how much of a/b do you want to exclude
ab = list(zip(a,b))  # a list of tuples where the first element is from `a` and the second is from `b`

new_ab = random.sample(ab, int(len(a)*(1-frac)))  # sample those tuples
new_a, new_b = zip(*new_ab)  # unzip the tuples to get `a` and `b` back

Note that this does not preserve the original order of a and b.
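
If the original order matters, as it does in the question, one possible workaround is to carry each pair's position along with it and sort by that position after sampling. The following is a rough sketch of that idea, not part of the original answer; the names `indexed`, `n_keep` and `kept` are made up for illustration.

```python
import random

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

frac = 0.2  # fraction of items to remove

# Tag every (a, b) pair with its original position
indexed = list(enumerate(zip(a, b)))
n_keep = len(indexed) - int(frac * len(indexed))

# Sample the tagged pairs, then sort by position to restore the order
kept = sorted(random.sample(indexed, n_keep), key=lambda item: item[0])

new_a = [pair[0] for _, pair in kept]
new_b = [pair[1] for _, pair in kept]

print(new_a)
print(new_b)
```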

Answer 4

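You can also work on the zipped a and b sequences: take a random sample of the indices, sort it (to maintain the original order of the items), and unzip the result into a_new and b_new again: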

import random


a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]

frac = 0.2

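c = list(zip(a, b))  # list() is needed on Python 3, where zip returns an iterator
n_keep = len(c) - int(frac * len(c))  # number of pairs to keep (must be an int)
indices = random.sample(range(len(c)), n_keep)
# Sort only the indices; sorting the pairs themselves would reorder by value
a_new, b_new = zip(*(c[i] for i in sorted(indices)))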

print(a_new)
print(b_new)

Answer 6

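import random

a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
n = 0.2  # fraction to remove, as in the question

l = len(a)
n_drop = int(l * n)
n_keep = l - n_drop
# mask of 1s (keep) and 0s (drop), shuffled so the drops land at random positions
ind = [1] * n_keep + [0] * n_drop
random.shuffle(ind)
new_a = [ e for e, i in zip(a, ind) if i ]
new_b = [ e for e, i in zip(b, ind) if i ]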
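
The same keep/drop mask can also be applied with `itertools.compress` from the standard library, which yields only the items whose selector is truthy. A small sketch of that variant, assuming the same a, b and n as in the question (the name `mask` is mine):

```python
import random
from itertools import compress

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
n = 0.2  # fraction to remove (value taken from the question's example)

n_drop = int(len(a) * n)

# True marks an item to keep, False an item to drop
mask = [True] * (len(a) - n_drop) + [False] * n_drop
random.shuffle(mask)

# The same mask is applied to both lists, so the same indices are removed
new_a = list(compress(a, mask))
new_b = list(compress(b, mask))

print(new_a)
print(new_b)
```

Because the mask is built once and reused, the two lists stay aligned and the surviving items keep their original order.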
