Question

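I want to use Monte Carlo simulation to find out how likely different parameter combinations are. I have 4 parameters, and each can take around 250 values. Using some probability distribution functions, I randomly generated 250,000 scenarios. I now want to know which parameter combinations are most likely to occur. To do this, I first filter out any duplicates from my 250,000 randomly generated samples to shorten the list. I then iterate over this reduced list and check how many times each scenario occurs in the original 250,000-item list.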

I have a large list of 250,000 items, each of which is itself a list, like this:

a = [[1,2,5,8],[1,2,5,8],[3,4,5,6],[3,4,5,7],....,[3,4,5,7]]# len(a) is equal to 250,000

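I want to find a fast and efficient way to make each list within my list occur only once.

The ultimate goal is to count the number of occurrences of each list in list a.

What I have so far: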

'''Removing duplicates from list a and storing this as a new list temp'''
b_set = set(tuple(x) for x in a)
temp = [ list(x) for x in b_set ]
temp.sort(key = lambda x: a.index(x) )    

''' I then iterate through each of my possible lists (i.e. temp) and count how many times they occur in a'''
most_likely_dict = {}
for scenario in temp:
    freq = a.count(scenario)  # assumes scenario_list in the original post was the same list as a
    most_likely_dict[str(scenario)] = freq

This currently takes 15 minutes to run... any suggestions on how to get it down to a few seconds would be greatly appreciated!!

Answer 1

You can drop the sorting step, since the end result is a dictionary, which is unordered anyway, and then use a dict comprehension:

>>> a = [[1,2],[1,2],[3,4,5],[3,4,5], [3,4,5]]
>>> a_tupled = [tuple(i) for i in a]
>>> b_set = set(a_tupled)
>>> {repr(i): a_tupled.count(i) for i in b_set}
{'(1, 2)': 2, '(3, 4, 5)': 3}

Calling list on the tuples adds more overhead, but you can do that if you want to:

>>> {repr(list(i)): a_tupled.count(i) for i in b_set}
{'[3, 4, 5]': 3, '[1, 2]': 2}

Or just use Counter:

>>> from collections import Counter
>>> Counter(tuple(i) for i in a)
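
As a rough sketch of how the Counter approach could replace the whole dedupe-and-count loop from the question: the names count_scenarios and sample below are made up for illustration, and the data is a toy stand-in for the 250,000 scenarios.

```python
from collections import Counter


def count_scenarios(scenarios):
    """Count how often each inner list occurs, in a single pass."""
    # Lists are unhashable, so each one is converted to a tuple first.
    return Counter(tuple(s) for s in scenarios)


# Toy stand-in for the 250,000 generated scenarios (hypothetical data).
sample = [[1, 2, 5, 8], [1, 2, 5, 8], [3, 4, 5, 6], [3, 4, 5, 7], [3, 4, 5, 7]]
counts = count_scenarios(sample)

# most_common(n) returns the n most frequent scenarios with their counts.
top = counts.most_common(2)
print(top)
```

Each scenario is hashed once, so the work grows linearly with the number of samples instead of rescanning the full list for every distinct scenario.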

Answer 2

{str(item):a.count(item) for item in a}

Input:

a = [[1,2,5,8],[1,2,5,8],[3,4,5,6],[3,4,5,7],[3,4,5,7]]

Output:

{'[3, 4, 5, 6]': 1, '[1, 2, 5, 8]': 2, '[3, 4, 5, 7]': 2}
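
One caveat: the comprehension calls a.count once for every item in a, so its cost grows quadratically with len(a). A minimal single-pass sketch that builds the same string-keyed dictionary might look like this; count_by_str is a hypothetical helper name, not something from the answer.

```python
def count_by_str(scenarios):
    """Build {str(inner_list): occurrences} in one pass over the input."""
    counts = {}
    for item in scenarios:
        key = str(item)  # same key format as the comprehension in the answer
        counts[key] = counts.get(key, 0) + 1
    return counts


a = [[1, 2, 5, 8], [1, 2, 5, 8], [3, 4, 5, 6], [3, 4, 5, 7], [3, 4, 5, 7]]
result = count_by_str(a)
print(result)
```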
