Question

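Is there a way to randomly shuffle which keys correspond to which values? I found random.sample, but I was wondering whether there is a more Pythonic / faster way.

Example: a = {"one":1,"two":2,"three":3}

Shuffled: a_shuffled = {"one":2,"two":3,"three":1}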

Answer 1

In [47]: import random

In [48]: keys = list(a.keys())

In [49]: values = list(a.values())

In [50]: random.shuffle(values)

In [51]: a_shuffled = dict(zip(keys, values))

In [52]: a_shuffled
Out[52]: {'one': 2, 'three': 1, 'two': 3}

Or, more concisely:

In [56]: dict(zip(a.keys(), random.sample(list(a.values()), len(a))))
Out[56]: {'one': 3, 'three': 2, 'two': 1}

(But I suppose this is the solution you had already come up with.)
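For readers on Python 3, the same idea can be wrapped in a small function. This is only a sketch: the name `shuffle_values` and the optional `rng` parameter are made up for illustration and are not part of the answer above.

```python
import random


def shuffle_values(d, rng=None):
    """Return a new dict with the same keys and randomly reassigned values."""
    # Hypothetical hook: pass a seeded random.Random for reproducible results.
    rng = random if rng is None else rng
    # Copy the values into a list, because a dict view cannot be shuffled in place.
    values = list(d.values())
    rng.shuffle(values)
    # Iterating a dict yields its keys, so zip pairs each key with a shuffled value.
    return dict(zip(d, values))


a = {"one": 1, "two": 2, "three": 3}
a_shuffled = shuffle_values(a, random.Random(0))
print(a_shuffled)
```

The original dict is left untouched, and a key may still end up with its old value, since every permutation (including the identity) is possible.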

Note that although using random.sample is simpler, using random.shuffle is slightly faster:

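import random
import string
def using_shuffle(a):
    # Copy keys and values into lists; random.shuffle needs a mutable sequence.
    keys = list(a.keys())
    values = list(a.values())
    random.shuffle(values)
    return dict(zip(keys, values))

def using_sample(a):
    # Sampling len(a) items without replacement yields a shuffled copy of the values.
    return dict(zip(a.keys(), random.sample(list(a.values()), len(a))))

# Build a test dict from N random four-letter keys (duplicate keys collapse).
N = 10000
keys = [''.join(random.choice(string.ascii_letters) for j in range(4)) for i in range(N)]
a = dict(zip(keys, range(N)))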

In [71]: %timeit using_shuffle(a)
100 loops, best of 3: 5.14 ms per loop

In [72]: %timeit using_sample(a)
100 loops, best of 3: 5.78 ms per loop
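Outside IPython, a comparable measurement can be made with the standard `timeit` module. The following is a rough sketch rather than a reproduction of the numbers above: the dict contents, the loop counts, and the output format are arbitrary choices, and the results will vary by machine and Python version.

```python
import random
import timeit


def using_shuffle(a):
    values = list(a.values())
    random.shuffle(values)
    return dict(zip(a, values))


def using_sample(a):
    return dict(zip(a, random.sample(list(a.values()), len(a))))


# Arbitrary test data: 10000 integer keys mapped to themselves.
a = {i: i for i in range(10000)}

LOOPS = 20  # calls per timing run (assumed value, kept small so the script is quick)
for func in (using_shuffle, using_sample):
    # Take the best of 3 runs, similar in spirit to %timeit's "best of 3".
    best = min(timeit.repeat(lambda: func(a), repeat=3, number=LOOPS)) / LOOPS
    print("{}: {:.2f} ms per loop".format(func.__name__, best * 1e3))
```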

Answer 2

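Sorry, the only way to make it faster is to use numpy :/. Whatever you do, it has to scramble all the indices somehow, which takes time, so doing it in C helps a little. Also, the difference between plain random selection and this kind of shuffling is that you cannot have repeated indices.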
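The point about repeated indices can be illustrated with the standard library alone. This is a small sketch with an arbitrary size and seed: a shuffle is a permutation, so each index appears exactly once, whereas independent random draws may repeat an index.

```python
import random

n = 5
rng = random.Random(42)  # arbitrary seed, used only to make the run repeatable

# A permutation: sampling n items from n without replacement, no index repeats.
permutation = rng.sample(range(n), n)

# Independent draws with replacement: the same index may appear more than once.
with_repeats = rng.choices(range(n), k=n)

print(permutation, sorted(permutation) == list(range(n)))
print(with_repeats, len(set(with_repeats)), "distinct indices")
```

Only the permutation is suitable for reassigning dictionary values, because drawing with replacement could drop some values and duplicate others.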

Sorry, this has become a bit long, so you will have to do some scrolling.

E.g.


# made for python 2.7; also runs on python 3
import random
import numpy as np


def given_seq():
    # general example
    a = {"one": 1, "two": 2, "three": 3}
    keys = list(a.keys())
    random.shuffle(keys)
    a = dict(zip(keys, a.values()))

# large example

a = dict(zip(range(0, 100000), range(1, 100001)))

def random_shuffle():
    keys = list(a.keys())
    random.shuffle(keys)
    b = dict(zip(keys, a.values()))

def np_random_shuffle():
    keys = list(a.keys())
    np.random.shuffle(keys)
    b = dict(zip(keys, a.values()))

def np_random_permutation():
    # more concise and using numpy's permutation option
    b = dict(zip(np.random.permutation(list(a.keys())), a.values()))

# if you precompute the key array as a numpy array

def np_random_keys_choice():
    akeys = np.array(list(a.keys()))
    return dict(zip(akeys[np.random.permutation(len(akeys))], a.values()))

def np_random_keys_shuffle():
    key_indexes = np.arange(len(a))
    np.random.shuffle(key_indexes)
    return dict(zip(np.array(list(a.keys()))[key_indexes], a.values()))

# fixed dictionary size
key_indexes = np.arange(len(a))
def np_random_fixed_keys_shuffle():
    np.random.shuffle(key_indexes)
    return dict(zip(np.array(list(a.keys()))[key_indexes], a.values()))


# so dstack actually slows things down
def np_random_shuffle_dstack():
    keys = list(a.keys())
    np.random.shuffle(keys)
    return dict(np.dstack((keys, list(a.values())))[0])

if __name__ == '__main__':
    import timeit
    # global namespace introspection could automate the list below, but it's not needed yet
    for func in ['given_seq', 'random_shuffle', 'np_random_shuffle', 'np_random_permutation', 'np_random_keys_choice',
            'np_random_keys_shuffle', 'np_random_fixed_keys_shuffle']:
        print(func, timeit.timeit("{}()".format(func), setup="from __main__ import {}".format(func), number=200))

given_seq 0.00103783607483
random_shuffle 23.869166851
np_random_shuffle 16.3060112
np_random_permutation 21.9921720028
np_random_keys_choice 21.8105020523
np_random_keys_shuffle 22.4905178547
np_random_fixed_keys_shuffle 21.8256559372

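Using choice/permutation may look nicer, but it is not any faster. Unfortunately, copying is usually slow unless the data is small, and there is no way to pass a pointer/reference without taking up an extra line, although I would debate whether that makes it 'un-Pythonic'.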
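As a side note on copying, one way to avoid building a second dictionary is to write the shuffled values back into the existing one. The sketch below uses a hypothetical helper name, `shuffle_values_in_place`, and makes no claim about speed: it still copies the keys and values into temporary lists, it only skips creating a new dict.

```python
import random


def shuffle_values_in_place(d):
    """Reassign the values of d among its existing keys; no new dict is created."""
    values = list(d.values())
    random.shuffle(values)
    # Snapshot the keys first, then overwrite each existing key's value.
    # No keys are added or removed, so the dict keeps its size.
    for key, value in zip(list(d), values):
        d[key] = value


a = {"one": 1, "two": 2, "three": 3}
alias = a  # a second reference to the same dict object
shuffle_values_in_place(a)
print(alias is a, sorted(a.values()))  # True [1, 2, 3]
```

Every other reference to the dict sees the shuffled mapping, which may or may not be what the caller wants.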

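That is, if you look at the Zen of Python, or just run import this in a Python session, one of the lines is:

Although practicality beats purity.

So it is open to interpretation, of course :)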

Randomly shuffling the keys and values in a Python dictionary
