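In my program I have a function that turns a string into a list of words, but when it encounters a newline character it joins the two words on either side of the newline. Example:
"newline\n problem"
Printing the result in the main function gives this:
print(seperate_words)
newlineproblem
Here is the code: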
def stringtolist(lines):
    # string of acceptable characters
    acceptable = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'’- "
    new_string = ''
    for i in lines:
        # runs through the string and checks to see what characters are in the string
        if i in acceptable:
            i = i.lower()
            # if it is an acceptable character it is added to new string
            new_string += i
        elif i == '.""':
            # if it is a period or quotation marks it is replaced with a space in the new string
            new_string += ' '
        else:
            # for every other character it is removed and not added to new string
            new_string += ''
    #splits the string into a list
    seperate_words = new_string.split(' ')
    return seperate_words
Answer 0 (score: 1)
You can split the string on several delimiters at once:
def stringtolist(the_string):
    import re
    # raw string so the regex engine sees \n; a dot needs no escape inside [...]
    return re.split(r'[ .\n]', the_string)
If needed, you can add other delimiters (such as quotation marks) to the character class: re.split(r'[ .\n\'"]', the_string)
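One thing to watch with this approach: when two delimiters sit next to each other (as in "newline\n problem"), re.split returns empty strings between them. The sketch below is one way around that, assuming you want adjacent delimiters treated as a single separator and the words lowercased as in the question. The name split_words is made up for this example.

```python
import re

def split_words(text):
    # A run of one or more delimiter characters counts as a single separator.
    parts = re.split(r'[ .\n\'"]+', text)
    # Drop the empty strings that can remain at the start or end,
    # and lowercase each word as the question's code does.
    return [p.lower() for p in parts if p]

print(split_words("newline\n problem"))  # ['newline', 'problem']
```

Unlike the question's code, this does not remove other punctuation; it only splits on the characters listed in the character class.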
Answer 1 (score: 0)
You can check for the newline character and skip it. Here is an example.
newstring = ''
for ch in string:
    if ch != '\n':
        newstring += ch
Or use
.strip() to remove leading and trailing newlines, or .replace('\n', '') to remove newlines altogether
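For comparison, here is a small sketch of how the built-in string methods treat newlines. The sample text is made up for illustration. Note that removing a newline outright joins the neighbouring words, which is the behaviour the question is trying to avoid, so replacing it with a space or calling split() with no argument may be closer to what is wanted.

```python
text = "newline\nproblem\n"

stripped = text.strip()           # removes only leading/trailing whitespace
removed = text.replace("\n", "")  # removes every newline, joining the words
spaced = text.replace("\n", " ")  # turns each newline into a space
words = text.split()              # splits on any run of whitespace

print(repr(stripped))  # 'newline\nproblem'
print(repr(removed))   # 'newlineproblem'
print(repr(spaced))    # 'newline problem '
print(words)           # ['newline', 'problem']
```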
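Answer 2 (score: 0)
Since the comments in the original code describe several transformations, a more flexible approach might be to use the translate() string method (together with the maketrans() function):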
def stringtolist(lines):
    # Python 2 only: xrange and string.maketrans do not exist in Python 3
    import string
    acceptable_chars = string.ascii_letters + string.digits + "'`- "
    space_chars = '."'
    # delete everything that is neither acceptable nor mapped to a space
    delete_chars = ''.join(set(map(chr, xrange(256))) - set(acceptable_chars + space_chars))
    table = string.maketrans(acceptable_chars + space_chars, acceptable_chars.lower() + (' '*len(space_chars)))
    return lines.translate(table, delete_chars).split()
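The code above relies on Python 2 APIs. A rough Python 3 equivalent might look like the sketch below. Two things here are assumptions rather than part of the answer: the function name stringtolist_py3, and the addition of "\n" to space_chars so that a newline separates words instead of being deleted. Python 3's str.translate takes no delete argument, so unwanted characters are filtered out first.

```python
import string

def stringtolist_py3(lines):
    acceptable_chars = string.ascii_letters + string.digits + "'`- "
    # Assumption: "\n" is added here so that a newline becomes a word separator.
    space_chars = '."\n'
    allowed = set(acceptable_chars + space_chars)

    # Map acceptable characters to lowercase and space_chars to a space.
    table = str.maketrans(
        acceptable_chars + space_chars,
        acceptable_chars.lower() + " " * len(space_chars),
    )

    # Drop every character that is neither acceptable nor a space character.
    kept = "".join(ch for ch in lines if ch in allowed)
    # split() with no argument collapses runs of spaces.
    return kept.translate(table).split()

print(stringtolist_py3("newline\nproblem"))  # ['newline', 'problem']
```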