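I am trying to split a variable-length string across lines of different but predefined lengths. I put together the code below, but when I ran it in Python Tutor (I don't have access to a proper Python IDE right now) it failed with KeyError: 6. I guess this means my while loop isn't working properly and keeps trying to increment lineNum, but I'm not sure why. Is there a better way to do this, or is this an easy fix?
Code: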
import re
#Dictionary containing the line number as key and the max line length
lineLengths = {
1:9,
2:11,
3:12,
4:14,
5:14
}
inputStr = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING" #Test string, should be split on the spaces and around the "X"
splitted = re.split("(?:\s|((?<=\d)X(?=\d)))",inputStr) #splits inputStr on white space and where X is surrounded by numbers eg. dimensions
lineNum = 1 #initialises the line number at 1
lineStr1 = "" #initialises each line as a string
lineStr2 = ""
lineStr3 = ""
lineStr4 = ""
lineStr5 = ""
#Dictionary creating dynamic line variables
lineNumDict = {
1:lineStr1,
2:lineStr2,
3:lineStr3,
4:lineStr4,
5:lineStr5
}
if len(inputStr) > 40:
    print "The short description is longer than 40 characters"
else:
    while lineNum <= 5:
        for word in splitted:
            if word != None:
                if len(lineNumDict[lineNum]+word) <= lineLengths[lineNum]:
                    lineNumDict[lineNum] += word
                else:
                    lineNum += 1
            else:
                if len(lineNumDict[lineNum])+1 <= lineLengths[lineNum]:
                    lineNumDict[lineNum] += " "
                else:
                    lineNum += 1
lineOut1 = lineStr1.strip()
lineOut2 = lineStr2.strip()
lineOut3 = lineStr3.strip()
lineOut4 = lineStr4.strip()
lineOut5 = lineStr5.strip()
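I have already looked at this answer, but I have no real knowledge of C#: Split large text string into variable length strings without breaking words and keeping linebreaks and spaces
Answer 0 (score: 1)
It doesn't work because the for word in splitted loop sits inside the while loop that checks lineNum, so the lineNum <= 5 condition is only tested again after every word has been processed. You have to do it like this: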
if len(inputStr) > 40:
    print "The short description is longer than 40 characters"
else:
    for word in splitted:
        if lineNum > 5:
            break
        if word != None:
            if len(lineNumDict[lineNum]+word) <= lineLengths[lineNum]:
                lineNumDict[lineNum] += word
            else:
                lineNum += 1
        else:
            if len(lineNumDict[lineNum])+1 <= lineLengths[lineNum]:
                lineNumDict[lineNum] += " "
            else:
                lineNum += 1
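Also, lineStr1, lineStr2, etc. will not be changed; you have to read the results from the dict directly (strings are immutable, so += on a dict entry rebinds that entry and leaves the original variable untouched). I tried it and got this result: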
print("Lines: %s" % lineNumDict)
which gives:
Lines: {1: 'THIS IS A', 2: 'LONG DESC 7', 3: '7 NEEDS ', 4: '', 5: ''}
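Note that the "X" and "SPLITTING" are missing from the output above: when a token does not fit, the loop moves on to the next line but never adds that token anywhere. Below is a minimal sketch of one way to handle this, assuming it is acceptable to keep the lines in a list and to retry the token on the following line. The function name fill_lines and its parameters are illustrative and not part of the original code.

```python
import re

def fill_lines(text, max_lens):
    """Greedily pack tokens into lines; a token that does not fit
    on the current line is retried on the next one."""
    # Same split as in the question: None marks a space, "X" is kept as a token.
    tokens = re.split(r"(?:\s|((?<=\d)X(?=\d)))", text)
    lines = [""] * len(max_lens)
    i = 0  # index of the line currently being filled
    for tok in tokens:
        piece = " " if tok is None else tok
        # Advance until the piece fits or we run out of lines.
        while i < len(max_lens) and len(lines[i] + piece) > max_lens[i]:
            i += 1
        if i == len(max_lens):
            break  # no lines left; the remaining tokens are dropped
        if piece == " " and not lines[i]:
            continue  # do not start a line with a space
        lines[i] += piece
    return [line.strip() for line in lines]

print(fill_lines("THIS IS A LONG DESC 7X7 NEEDS SPLITTING", [9, 11, 12, 14, 14]))
# ['THIS IS A', 'LONG DESC 7', 'X7 NEEDS', 'SPLITTING', '']
```

Returning a list also sidesteps the problem with lineStr1 to lineStr5, since the caller reads the results from the returned value.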
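Answer 1 (score: 0)
for word in splitted:
    ...
    lineNum += 1
Your code can increment lineNum once for every element of splitted (17 elements for this input, counting the None separators), because the while condition is only checked again after the for loop has finished.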
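To see this in action, it can help to print what re.split returns and to re-run the question's loop with the error caught. The following is only a sketch that mirrors the question's logic; the snake_case names and the try/except wrapper are added here for illustration.

```python
import re

input_str = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"
# The pattern contains a capturing group, so re.split also returns the group
# for every split: None for whitespace, "X" for the dimension separator.
tokens = re.split(r"(?:\s|((?<=\d)X(?=\d)))", input_str)
print(len(tokens))  # 17

line_lengths = {1: 9, 2: 11, 3: 12, 4: 14, 5: 14}
lines = {n: "" for n in line_lengths}
line_num = 1
error = None
try:
    while line_num <= 5:          # only re-checked after the whole for loop
        for tok in tokens:
            piece = " " if tok is None else tok
            if len(lines[line_num] + piece) <= line_lengths[line_num]:
                lines[line_num] += piece
            else:
                line_num += 1     # can pass 5 in the middle of the for loop
except KeyError as exc:
    error = exc

print(line_num)  # 6
```

The first pass over the tokens leaves line_num at 4; the while condition still holds, so the tokens are processed a second time and line_num reaches 6 partway through that pass.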
Answer 2 (score: 0)
I wonder whether a properly commented regular expression would be easier to understand?
import re
lineLengths = {1:9,2:11,3:12,4:14,5:14}
inputStr = "THIS IS A LONG DESC 7X7 NEEDS SPLITTING"
# raw string so that \s reaches the regex engine unchanged
pat = r"""
(?:                    # non-capturing group around the line, as we want to drop leading spaces
    \s*                # drop leading spaces
    (.{{1,{max_len}}}) # up to max_len characters; max_len is filled in through 'format'
    (?=[\sX]|$)        # whitespace, X and the end of the string act as terminators
                       # (\b inside a character class would mean backspace, so it is left out),
                       # matched without consuming, as we need X to go into the next match
)?                     # ignore missing matches if not all lines are needed
"""
# build a pattern matching up to 5 lines with the corresponding max lengths
pattern = ''.join(pat.format(max_len=x) for x in lineLengths.values())
re.match(pattern, inputStr, re.VERBOSE).groups()
# Out: ('THIS IS A', 'LONG DESC 7', 'X7 NEEDS', 'SPLITTING', None)
Also, there is no point in using a dict for the line lengths; a list would do the job just as well.
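As a rough illustration of the list-based variant, the same kind of pattern can be built from a plain list of maximum lengths. This is a sketch under the assumption that one optional group per line is wanted; the names LINE_PAT and split_lines are made up for this example.

```python
import re

# One optional group per output line. Doubled braces survive str.format.
LINE_PAT = r"""
(?:
    \s*                  # drop leading spaces
    (.{{1,{max_len}}})   # up to max_len characters
    (?=[\sX]|$)          # must be followed by whitespace, an X, or the end
)?
"""

def split_lines(text, max_lens):
    """Return one entry per line length; unused lines come back as None."""
    pattern = "".join(LINE_PAT.format(max_len=n) for n in max_lens)
    # Every group is optional, so re.match always succeeds here.
    return re.match(pattern, text, re.VERBOSE).groups()

line_lengths = [9, 11, 12, 14, 14]
print(split_lines("THIS IS A LONG DESC 7X7 NEEDS SPLITTING", line_lengths))
# ('THIS IS A', 'LONG DESC 7', 'X7 NEEDS', 'SPLITTING', None)
```

One caveat: the lookahead accepts any "X" as a break point, not only an X between digits, so a word containing X may be split if it does not fit on the line. Text that does not fit into the given lines is silently left unmatched.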