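I am generating a list of partitions from a list of elements (like the partitions of a set, i.e. set partitions). The problem is that I need to assign each of these partitions a random number indicating its value, so that I can later run some computations on output data made up of partition = value pairs.
A sample would be a CSV with entries like the following: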
p,v
"[[1, 2, 3, 4]]",0.3999960625186746
"[[1], [2, 3, 4]]",0.49159520559753156
"[[1, 2], [3, 4]]",0.12658202037597555
"[[1, 3, 4], [2]]",0.11670775560336522
"[[1], [2], [3, 4]]",0.006059031164368345
Here is the code I wrote for this:
from collections import defaultdict
import random
import csv
partitions = []
elements = input('Please specify number of elements: ')
size = int(elements)
fileheader = str(size)
# simple menu
if size == 1:
    partitionlist = range(1,size+1)
    print ('A one element list have 1 partition')
elif size < 28:
    partitionlist = range(1,size+1)
elif size >= 28:
    partitionlist = [0]
    print ("Invalid number. Try again...")
# generate all partitions
def partition(elements):
    if len(elements) == 1:
        yield [ elements ]
        return
    first = elements[0]
    for smaller in partition(elements[1:]):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            yield smaller[:n] + [[ first ] + subset] + smaller[n+1:]
        # put `first` in its own subset
        yield [ [ first ] ] + smaller
for p in partition(partitionlist):
    partitions.append([sorted(p)] + [random.uniform(0,1)])
# write the generated input to CSV file
data = partitions
def partition_value_data(size):
    with open( size+'-elem-normaldist.csv','w') as out:
        csv_out=csv.writer(out)
        csv_out.writerow(['p','v'])
        for row in data:
            csv_out.writerow(row)
partition_value_data(fileheader)
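The problem I am facing is that I get a MemoryError once the number of elements goes above 13. Is this caused by my computer's memory or by a limitation of Python itself? I am using Python 2.7.12.
For a list of 15 elements, the number of partitions is about 1382958545.
I am trying to generate the partitions of a list with up to 30 elements, where the number of partitions is about 545717047947902329359.
Any suggestions are much appreciated. Thank you.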
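Answer 0 (score: 1)
Your problem is that you collect the generator's output into a list, which completely negates any benefit of using a generator.
Instead, you should write each row out directly as the generator produces it.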
import random
import csv
elements = input('Please specify number of elements: ')
size = int(elements)
fileheader = str(size)
# simple menu
# list(...) keeps the slicing and concatenation below working on Python 3 as well
if size == 1:
    partitionlist = list(range(1,size+1))
    print ('A one element list has 1 partition')
elif size < 28:
    partitionlist = list(range(1,size+1))
elif size >= 28:
    partitionlist = [0]
    print ("Invalid number. Try again...")
# generate all partitions
def partition(elements):
    if len(elements) == 1:
        yield [ elements ]
        return
    first = elements[0]
    for smaller in partition(elements[1:]):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            yield smaller[:n] + [[ first ] + subset] + smaller[n+1:]
        # put `first` in its own subset
        yield [ [ first ] ] + smaller
def partition_value_data(size):
    with open( size+'-elem-normaldist.csv','w') as out:
        csv_out=csv.writer(out)
        csv_out.writerow(['p','v'])
        # stream each partition straight to the file instead of building a list first
        for row in partition(partitionlist):
            csv_out.writerow([sorted(row)] + [random.uniform(0,1)])
partition_value_data(fileheader)
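Keep in mind that streaming only removes the memory bottleneck: the file still gets one row per partition, and that count is a Bell number, which grows very quickly. As a rough sketch for estimating the row count before starting a run, the Bell triangle can be used. The name `bell_number` is a hypothetical helper introduced here for illustration; it is not part of the code above.

```python
def bell_number(n):
    """Return the number of set partitions of an n-element set (the n-th Bell number)."""
    row = [1]  # row 0 of the Bell triangle
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Every further entry is the entry to its left plus the entry above that one.
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    # The first entry of row n is the n-th Bell number.
    return row[0]


# Print the expected number of CSV rows for a few sizes.
for n in (4, 13, 15):
    print(n, bell_number(n))
```

For 15 elements this gives 1382958545 rows, matching the figure in the question, so even with constant memory use the run time and the size of the CSV become the limiting factors well before 30 elements.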