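I extracted thousands of IDs from a CSV file (they are now a generator of IDs) so that I can iterate over them and process them.
To optimize the code, I group these IDs and process a whole batch at a time.
The following code partitions the generator into batches of size n:
from itertools import zip_longest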
def grouper(n, iterable):
    """Grouping of iterable with n objects.

    :n: number of values in a group
    :iterable: iterable or string to be iterated
    :return: groups (tuples) of string/iterator values

    grouper(3, 'abcdefg') --> ('a','b','c'), ('d','e','f'), ('g', None, None)
    """
    return zip_longest(*[iter(iterable)]*n)
For example:
>>> acc_ids = ['ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47', 'ID54', 'ID58']
>>> # As an iterator
>>> id_generator = (i for i in acc_ids)
>>> batches = grouper(7, id_generator)
>>> batches
<itertools.zip_longest object at 0x7f3beb3313b8>
# This iterator is equivalent to the list below; note the padded `None`s at the end of the last batch:
# [('ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47'), ('ID54', 'ID58', None, None, None, None, None)]
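The problem is that, to remove the padded None values from each batch, I am using filter:
for batch in batches:
    batch = list(filter(None, batch))
This filter removes the None values from the batch. Rather than adding an extra filter, I was wondering whether we could prevent the padded None values from being produced when the generator is split in the first place...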
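One caveat with the filter approach is worth illustrating: `filter(None, ...)` drops every falsy item, not only `None`. The batch below is made up for illustration (the real IDs are non-empty strings, so they would not be affected); it shows the difference between `filter(None, ...)` and an explicit `is not None` check.

```python
# Hypothetical batch containing falsy values (0 and '') plus None padding.
batch = ('ID54', 0, '', None, None)

# filter(None, ...) keeps only truthy items, so 0 and '' are lost as well.
dropped_falsy = list(filter(None, batch))

# An explicit identity check removes only the None padding.
dropped_none = [item for item in batch if item is not None]

print(dropped_falsy)  # ['ID54']
print(dropped_none)   # ['ID54', 0, '']
```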
Query:
Can grouper be modified so that it does not generate the padded None values?
Answer 0 (score: 3)
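This may work for you:
from itertools import islice

def grouper(n, iterable):
    # Yield tuples of up to n items; the last tuple may be shorter (no padding).
    iter_ = iter(iterable)
    while True:
        # islice stops quietly when the iterator is exhausted. Calling next()
        # inside a generator expression raises RuntimeError on Python 3.7+ (PEP 479).
        res = tuple(islice(iter_, n))
        if not res:
            return
        yield res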
acc_ids = ['ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47', 'ID54', 'ID58']
id_generator = iter(acc_ids)
batches = grouper(7, id_generator)
print(list(batches))
Output:
[('ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47'), ('ID54', 'ID58')]
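If you would rather keep the `zip_longest` version from the question, a minimal sketch is to pad with a unique sentinel object instead of `None` and strip only that sentinel from the last batch. The names `_PAD` and `grouper_no_pad` are hypothetical, chosen for this illustration. Unlike filtering on `None`, this leaves genuine `None` or other falsy items in the data untouched.

```python
from itertools import zip_longest

_PAD = object()  # hypothetical unique sentinel; cannot collide with real data


def grouper_no_pad(n, iterable):
    """Yield tuples of up to n items, stripping only the sentinel padding."""
    for batch in zip_longest(*[iter(iterable)] * n, fillvalue=_PAD):
        # Only the final batch can contain padding, and it is always at the end.
        if batch[-1] is _PAD:
            batch = tuple(item for item in batch if item is not _PAD)
        yield batch


values = [0, None, 'ID21', '', 'ID24']
print(list(grouper_no_pad(3, values)))
# [(0, None, 'ID21'), ('', 'ID24')]
```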
Answer 1 (score: 1)
One possibility is to use an external library that already includes this kind of functionality:
>>> from iteration_utilities import grouper
>>> list(grouper(acc_ids, 7))
[('ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47'), ('ID54', 'ID58')]
or more_itertools.chunked:
>>> from more_itertools import chunked
>>> list(chunked(acc_ids, 7))
[['ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47'], ['ID54', 'ID58']]
or partition_all from pytoolz (toolz.partition_all) or from cytoolz (cytoolz.partition_all).
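These libraries all have permissive licenses (Apache, MIT and BSD), so even if you do not want the dependency, you could simply reuse their code (you may need to include their license alongside your code; see the respective licenses for details).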
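A standard-library option may also be worth a look: Python 3.12 added `itertools.batched`, which yields tuples and leaves the last one short instead of padding it. The sketch below uses it when it is available and otherwise defines a small fallback with the same name. The fallback is an assumption written for this illustration, not the standard library's own code. Note that the argument order is `(iterable, n)`, the reverse of the `grouper(n, iterable)` used in the question.

```python
from itertools import islice

try:
    from itertools import batched  # standard library, Python 3.12+
except ImportError:
    def batched(iterable, n):
        """Fallback sketch for older Pythons: tuples of up to n items."""
        if n < 1:
            raise ValueError("n must be at least one")
        iterator = iter(iterable)
        while True:
            batch = tuple(islice(iterator, n))
            if not batch:
                return
            yield batch


acc_ids = ['ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47', 'ID54', 'ID58']
id_generator = (i for i in acc_ids)
print(list(batched(id_generator, 7)))
# [('ID21', 'ID24', 'ID38', 'ID40', 'ID42', 'ID43', 'ID47'), ('ID54', 'ID58')]
```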