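I'm trying to write some code that takes a list of items and concatenates them (separated by commas) into long strings, where each string is no longer than a predefined length. For example, for this list:
colors = ['blue','pink','yellow']
and a maximum length of 10 characters, the output of the code would be:
Long String 0: blue,pink
Long String 1: yellow
I wrote the code below, but it fails in two cases: when the total length of the concatenated items is shorter than the maximum length allowed, and when it creates one or more long strings and the concatenation of the remaining items in the list is shorter than the maximum length.
What I'm asking is: in the code below, how would you "stop" the loop when the items run out but the concatenation is still too short to reach the "else" clause?
Many thanks :)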
import pyperclip
# Theoretical bug: when a single item is longer than max_length. Will never happen for the intended use of this code.
raw_list = pyperclip.paste()
split_list = raw_list.split()
unique_items_list = list(set(split_list)) # notice that set are unordered collections, and the original order is not maintained. Not crucial for the purpose of this code the way it is now, but good remembering. See more: http://stackoverflow.com/a/7961390/2594546
print "There are %d items in the list." % len(split_list)
print "There are %d unique items in the list." % len(unique_items_list)
max_length = 10 # salesforce's filters allow up to 1000 chars, but didn't want to hard code it in the rest of the code, just in case.
list_of_long_strs = []
short_list = [] # will hold the items that the max_length chars long str.
total_len = 0
items_processed = [] # will be used for sanity checking
for i in unique_items_list:
    if total_len + len(i) + 1 <= max_length: # +1 is for the length of the comma
        short_list.append(i)
        total_len += len(i) + 1
        items_processed.append(i)
    elif total_len + len(i) <= max_length: # if there's no place for another item+comma, it means we're nearing the end of the max_length chars mark. Maybe we can fit just the item without the unneeded comma.
        short_list.append(i)
        total_len += len(i) # should I end the loop here somehow?
        items_processed.append(i)
    else:
        long_str = ",".join(short_list)
        if long_str[-1] == ",": # appending the long_str to the list of long strings, while making sure the item can't end with a "," which can affect Salesforce filters.
            list_of_long_strs.append(long_str[:-1])
        else:
            list_of_long_strs.append(long_str)
        del short_list[:] # in order to empty the list.
        total_len = 0
unique_items_proccessed = list(set(items_processed))
print "Number of items concatenated:", len(unique_items_proccessed)
def sanity_check():
    if len(unique_items_list) == len(unique_items_proccessed):
        print "All items concatenated"
    else: # the only other option is that len(unique_items_list) > len(unique_items_proccessed)
        print "The following items weren't concatenated:"
        print ",".join(list(set(unique_items_list)-set(unique_items_proccessed)))
sanity_check()
print ",".join(short_list) # for when the loop doesn't end the way it should since < max_length. NEED TO FIND A BETTER WAY TO HANDLE THAT
for item in list_of_long_strs:
    print "Long String %d:" % list_of_long_strs.index(item)
    print item
    print
Answer 0 (score: 0)
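At the moment, you don't do anything with i in the else case, so that item is lost, and you don't handle short_list if the last item through the loop doesn't fill it.
The simplest solution is to restart short_list with i in it: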
short_list = [i]
total_len = len(i) + 1 # count i and its comma, the same way the first branch does
Then, after the for loop, check whether anything is left in short_list and, if so, process it:
if short_list:
list_of_long_strs.append(",".join(short_list))
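Taken together, the two changes above could look like the following minimal sketch. The function name chunk_items and the way it tracks the running length are assumptions made for illustration; they are not part of the original code.

```python
def chunk_items(items, max_length=10):
    """Join items with commas into strings of at most max_length chars.

    Hypothetical helper. An item longer than max_length is emitted
    on its own (and therefore exceeds the limit).
    """
    long_strs = []
    short_list = []
    total_len = 0  # always equal to len(",".join(short_list))
    for i in items:
        # a comma is only needed when short_list already holds something
        new_len = total_len + len(i) + (1 if short_list else 0)
        if new_len <= max_length:
            short_list.append(i)
            total_len = new_len
        else:
            if short_list:
                long_strs.append(",".join(short_list))
            short_list = [i]      # restart with the item that did not fit
            total_len = len(i)
    if short_list:                # flush whatever is left after the loop
        long_strs.append(",".join(short_list))
    return long_strs


print(chunk_items(["blue", "pink", "yellow"]))  # ['blue,pink', 'yellow']
```

Because the counter here only adds a comma when one is actually needed, a single comparison replaces the separate "fits with comma" and "fits exactly" branches.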
You can also simplify the if checks:
new_len = total_len + len(i)
if new_len < max_length:
    ...
elif new_len == max_length:
    ...
else:
    ...
You can get rid of the if/else block that starts with:
if long_str[-1] == ",":
because ",".join(...) never produces a trailing comma, so that condition never holds.
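A quick way to convince yourself of this is the short check below; the sample values are made up for illustration.

```python
parts = ["blue", "pink"]
joined = ",".join(parts)   # separators go only between elements
print(joined)              # blue,pink
print(joined[-1] == ",")   # False

empty = ",".join([])       # joining an empty list gives an empty string
print(repr(empty))         # ''
```

Note that indexing an empty string with [-1] raises IndexError, so the removed check was also a potential crash if short_list were ever empty at that point.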
You can also tidy up the last part of the code by using enumerate (I would switch to str.format as well):
for index, item in enumerate(list_of_long_strs):
    print "Long string {0}:".format(index)
    print item
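One reason to prefer enumerate over list.index here: index searches the list for the first equal element on every call, so it is slower and reports the wrong position when a value repeats. In the question the items are de-duplicated first, so repeats are unlikely there; the data below is made up to show the difference.

```python
long_strs = ["ab,cd", "ef", "ab,cd"]  # made-up data with a repeated string

# list.index returns the position of the first equal element
by_index = [long_strs.index(s) for s in long_strs]

# enumerate yields the actual position of each element
by_enumerate = [i for i, _ in enumerate(long_strs)]

print(by_index)      # [0, 1, 0]
print(by_enumerate)  # [0, 1, 2]
```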
More broadly, here's what I would do:
def process(unique_items_list, max_length=10):
    """Process the list into comma-separated strings with maximum length."""
    output = []
    working = []
    for item in unique_items_list:
        new_len = sum(map(len, working)) + len(working) + len(item)
        #         ^ items                  ^ commas       ^ new item?
        if new_len <= max_length:
            working.append(item)
        else:
            output.append(working)
            working = [item]
    output.append(working)
    return [",".join(sublist) for sublist in output if sublist]
def print_out(str_list):
    """Print out a list of strings with their indices."""
    for index, item in enumerate(str_list):
        print("Long string {0}:".format(index))
        print(item)
Demo:
>>> print_out(process(["ab", "cd", "ef", "gh", "ij", "kl", "mn"]))
Long string 0:
ab,cd,ef
Long string 1:
gh,ij,kl
Long string 2:
mn
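To check this approach against the example from the question, here is a self-contained copy of process run on the colors list. The second call is an extra case added here for illustration: it shows what happens to the "theoretical bug" mentioned in the question, a single item longer than max_length.

```python
def process(unique_items_list, max_length=10):
    """Process the list into comma-separated strings with maximum length."""
    output = []
    working = []
    for item in unique_items_list:
        # lengths of current items + one comma per current item + new item
        new_len = sum(map(len, working)) + len(working) + len(item)
        if new_len <= max_length:
            working.append(item)
        else:
            output.append(working)
            working = [item]
    output.append(working)
    # empty sublists are dropped here
    return [",".join(sublist) for sublist in output if sublist]


print(process(["blue", "pink", "yellow"]))   # ['blue,pink', 'yellow']
print(process(["abcdefghijklmnop", "ab"]))   # ['abcdefghijklmnop', 'ab']
```

An oversized item is not lost or split; it simply ends up in a string of its own that is longer than max_length, so callers that need a hard limit would have to handle that case separately.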
Answer 1 (score: 0)
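OK, the solution to the problem described in my original post turned out to be quite simple, and consists of two modifications:
The first one is in the else clause: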
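else:
    long_str = ",".join(short_list)
    list_of_long_strs.append(long_str)
    items_processed.extend(short_list) # for sanity checking
    del short_list[:] # in order to empty the list.
    short_list.append(i) # so we won't lose this particular item
    total_len = len(i) + 1 # +1 for the comma, to match how the first branch counts
The main point here is to append i after emptying short_list, so that the item that sent the loop into the else clause isn't lost. Similarly, total_len is set to the length of this item plus one for its comma, rather than 0 as before; without the +1, the elif branch could later admit an item that makes the joined string one character longer than max_length.
As the friendly commenter above suggested, the if-else inside the else clause was redundant, so I took it out.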
The second part:
residual_items_concatenated = ",".join(short_list)
list_of_long_strs.append(residual_items_concatenated)
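This part makes sure that when short_list doesn't "make it" to the else clause because total_len < max_length, its items are still concatenated and added as another element to the list of long strings, just like its friends before it.
I feel these two small modifications are the best solution to my problem, since they keep most of the code and change only a few lines instead of rewriting it from scratch.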
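For reference, here is a minimal sketch of the question's loop with both modifications applied, wrapped in a function so it can be tested. The name concatenate_items, the guards against empty chunks, and resetting total_len to len(i) + 1 (so the counter keeps including one comma per item and the elif branch cannot overshoot) are assumptions made here, not part of the code as posted.

```python
def concatenate_items(unique_items_list, max_length=10):
    """Hypothetical wrapper around the loop from the question, with both fixes."""
    list_of_long_strs = []
    short_list = []
    total_len = 0  # joined length so far, counting one comma after every item
    for i in unique_items_list:
        if total_len + len(i) + 1 <= max_length:   # item and its comma fit
            short_list.append(i)
            total_len += len(i) + 1
        elif total_len + len(i) <= max_length:     # item fits exactly, no comma
            short_list.append(i)
            total_len += len(i)
        else:
            if short_list:  # guard (assumption): skip empty chunks
                list_of_long_strs.append(",".join(short_list))
            short_list = [i]          # keep the item that did not fit
            total_len = len(i) + 1    # assumption: count its comma too
    if short_list:                    # residual items after the loop
        list_of_long_strs.append(",".join(short_list))
    return list_of_long_strs


items = ["abcdefgh", "abcde", "fghij"]
result = concatenate_items(items, max_length=10)
print(result)  # ['abcdefgh', 'abcde', 'fghij']

# sanity checks: nothing lost, nothing too long (items contain no commas here)
assert ",".join(result).split(",") == items
assert all(len(s) <= 10 for s in result)
```

The two assertions at the end play the role of the sanity_check function from the question: they confirm that every item made it into some string and that no string exceeds the limit.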