How to split a string into chunks of fewer than n characters, with a twist

Time: 2017-10-29 14:17:12

Tags: python split

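I have a long string that I want to save to a file. The words are separated by spaces, and the total number of words in the string is divisible by 3.

Basically, I am looking for a way to split the string into chunks. Each chunk should have fewer than n characters, and the number of words in each chunk should also be divisible by 3.

E.g.: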

>>> longstring = "This is a very long string and the sum of words is divisible by three"
>>> len(longstring.split())
15

Say the maximum line length is n = 30:

>>> split_string(longstring, 30)
['This is a very long string', 'and the sum of words is', 'divisible by three']
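
The two constraints can be checked mechanically. The following is a minimal sketch, not part of the original question: the helper name `satisfies_rules` and its `group` parameter are hypothetical, and it assumes a chunk may be exactly n characters long (that is, "no line longer than n").

```python
def satisfies_rules(chunks, n, group=3):
    """Return True if every chunk fits in n characters and
    holds a non-zero multiple of `group` words."""
    for chunk in chunks:
        word_count = len(chunk.split())
        if len(chunk) > n:
            return False
        if word_count == 0 or word_count % group != 0:
            return False
    return True


# The desired result from the example above.
expected = ['This is a very long string', 'and the sum of words is', 'divisible by three']
# A chunking that respects the length limit but ignores the word-count rule.
ignores_word_rule = ['This is a very long string and', 'the sum of words is divisible', 'by three']

print(satisfies_rules(expected, 30))
print(satisfies_rules(ignores_word_rule, 30))
```

The first call should report that the desired output is valid; the second fails because its first line holds 7 words and its last line holds 2.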

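Summary:

  1. The rule: no line may be longer than n characters.
  2. The twist: every new line must contain a multiple of 3 words.
  3. So far I have tried using textwrap, but I don't know how to implement 2.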

    import textwrap    
    textwrap.fill(long_line, width=69)
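
One possible way to get rule 2 out of textwrap is to glue every run of 3 words together with a placeholder character, so that textwrap treats each run as a single unbreakable word, and then turn the placeholder back into spaces. This is only a sketch under stated assumptions: the function name `wrap_word_groups`, the `group` parameter, and the `"\0"` glue character are hypothetical, and the text is assumed to contain no NUL characters. If the word count is not a multiple of `group`, the last line carries the leftover words; a group that is longer than `width` is kept whole on its own line rather than being broken.

```python
import textwrap


def wrap_word_groups(text, width, group=3, glue="\0"):
    """Wrap `text` so that lines break only between groups of `group` words."""
    words = text.split()
    # Glue each run of `group` words into one unit. The glue is one
    # character wide, like the space it stands in for, so line lengths
    # measured by textwrap match the final line lengths.
    units = [glue.join(words[i:i + group]) for i in range(0, len(words), group)]
    wrapped = textwrap.wrap(
        " ".join(units),
        width=width,
        break_long_words=False,   # never cut a glued unit in the middle
        break_on_hyphens=False,   # break only on whitespace
    )
    # Restore the spaces inside each unit.
    return [line.replace(glue, " ") for line in wrapped]


longstring = "This is a very long string and the sum of words is divisible by three"
for line in wrap_word_groups(longstring, 30):
    print(len(line), len(line.split()), line)
```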
    

1 answer:

Answer 0 (score: 1)

If you are sure that the total number of words in the string is always divisible by 3, you can do the following:

import sys
#long string; 84 words; divisible by 3
longString = "The charges are still sealed under orders from a federal judge. Plans were prepared Friday for anyone charged to be into custody as soon as Monday, the sources said. It is unclear what the charges are. A spokesman for the special counsel's office declined to comment. The White House also had no comment, a senior administration official said Saturday morning. A spokesman for the special counsel's office declined to comment. The White House also had no comment, a senior administration official said Saturday morning."
#convert string to list
listOfWords = longString.split()
#list to contain lines
lines = []
#make sure number of words is divisible by 3
if len(listOfWords) % 3 != 0:
    #exit
    print("number of words is not divisible by 3")
    sys.exit()
#keep going until list is empty
while listOfWords:
    #every line starts with one group of 3 words, so each pass consumes words
    #even if that first group alone is longer than 70 characters
    line = " ".join(listOfWords[:3])
    i = 3
    #keep adding groups of 3 words while they fit
    while i < len(listOfWords):
        #puts the next 3 words into a string
        temp = " ".join(listOfWords[i:i+3])
        #stop if the line plus a space plus the next 3 words would exceed 70 characters
        if len(line) + 1 + len(temp) > 70:
            break
        line += " " + temp
        i += 3
    #remove finished words from the list completely
    listOfWords = listOfWords[i:]
    #adds line into result list
    lines.append(line)

#to make sure this works
for line in lines:
    print("Number of words: {}".format(len(line.split())))
    print("number of chars: {}".format(len(line)))
    print(line)
    print("----------------------------------------")
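
The same greedy idea can be wrapped in a function with the signature the question asks for. This is a sketch rather than the answerer's code: the `group` parameter and the choice to raise `ValueError` instead of exiting are assumptions, and a line is allowed to be exactly n characters long.

```python
def split_string(text, n, group=3):
    """Greedily pack groups of `group` words into lines of at most n characters."""
    words = text.split()
    if len(words) % group != 0:
        raise ValueError("number of words is not divisible by {}".format(group))

    # Treat each run of `group` words as one indivisible unit.
    units = [" ".join(words[i:i + group]) for i in range(0, len(words), group)]

    lines = []
    current = ""
    for unit in units:
        if len(unit) > n:
            # No valid line can hold this unit, so the constraints cannot be met.
            raise ValueError("a group of {} words is longer than {} characters".format(group, n))
        if not current:
            current = unit
        elif len(current) + 1 + len(unit) <= n:
            # The unit still fits after adding one separating space.
            current += " " + unit
        else:
            lines.append(current)
            current = unit
    if current:
        lines.append(current)
    return lines


longstring = "This is a very long string and the sum of words is divisible by three"
print(split_string(longstring, 30))
```

Raising an exception instead of calling `sys.exit()` leaves it to the caller to decide what to do with input that cannot satisfy both rules.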