Question

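For my program, I have a function that converts a string into a list of words, but when it encounters a newline character it joins the two words on either side of the newline. Example: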

"newline\n   problem"

Printed in the main function, this comes out as:

print(seperate_words)
newlineproblem

Here is the code:

def stringtolist(lines):
    # string of acceptable characters
    acceptable = "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'’- " 
    new_string = ''
    for i in lines:
        # runs through the string and checks to see what characters are in the string
        if i in acceptable:
            i = i.lower()
            # if it is an acceptable character it is added to new string
            new_string += i
        elif i == '.""':
            # if it is a period or quotation marks it is replaced with a space in the new string
            new_string += ' '
        else:
            # for every other character it is removed and not added to new string
            new_string += ''


    #splits the string into a list
    seperate_words = new_string.split(' ')
    return seperate_words

Answer 1

You can split a string on several delimiters at once:

import re

def stringtolist(the_string):
    # Split on a space, a period, or a newline (raw string avoids invalid-escape warnings)
    return re.split(r'[ .\n]', the_string)

If needed, you can add other delimiters to the list (such as quotes, ...) => re.split(r'[ .\n\'\"]', the_string)
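
As a quick check of this approach on the string from the question, here is a minimal sketch. The `+` after the character class and the filtering of empty strings are assumptions added here, not part of the answer; they make the newline plus the following spaces count as a single split point.

```python
import re

def stringtolist(the_string):
    # Split on runs of spaces, periods, or newlines. The `+` is an addition
    # made here so that "\n   " is treated as one delimiter.
    parts = re.split(r'[ .\n]+', the_string)
    # Drop empty strings left behind by leading or trailing delimiters.
    return [p for p in parts if p]

print(stringtolist("newline\n   problem"))  # ['newline', 'problem']
```

Without the `+`, every single delimiter is a split point, so the result for the same input would contain empty strings between the two words.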

Answer 2

You can check for the newline character and skip it. Here is an example.

newstring = ''
for ch in string:
    # compare by value with !=, and note the newline escape is '\n', not '/n'
    if ch != '\n':
        newstring += ch

Or use

.strip() to remove leading and trailing newlines (it does not touch newlines inside the string)
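
To make the difference concrete, here is a small sketch using a sample string of my own. `str.strip()` only trims whitespace at the two ends, and skipping an interior newline glues the neighbouring words together. Replacing the newline with a space before splitting is an alternative not proposed in this answer; it is shown only for comparison.

```python
text = "newline\nproblem\n"

# strip() removes only leading/trailing whitespace; the inner "\n" stays.
stripped = text.strip()
print(repr(stripped))        # 'newline\nproblem'

# Skipping the newline character joins the two words.
skipped = ''.join(ch for ch in text if ch != '\n')
print(skipped)               # newlineproblem

# Turning the newline into a space keeps the words apart when splitting.
words = text.replace('\n', ' ').split()
print(words)                 # ['newline', 'problem']
```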

Answer 3

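Since the comments in the original code describe several transformations, a more flexible approach might be to use the translate() string method (together with the maketrans() function):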

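def stringtolist(lines):
    # Python 2 only: relies on xrange, string.maketrans and two-argument str.translate
    import string
    acceptable_chars = string.ascii_letters + string.digits + "'`- "
    space_chars = '."'
    # delete every character that is neither kept nor turned into a space
    delete_chars = ''.join(set(map(chr, xrange(256))) - set(acceptable_chars + space_chars))
    table = string.maketrans(acceptable_chars + space_chars, acceptable_chars.lower() + (' '*len(space_chars)))
    return lines.translate(table, delete_chars).split()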
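
For readers on Python 3, where `string.maketrans` and the two-argument `translate` no longer exist, the same idea might be sketched as follows. Two things here are assumptions rather than part of the answer: the newline is added to `space_chars` so that it becomes a space instead of being deleted, and the character set follows the question's `acceptable` string.

```python
import string

def stringtolist(lines):
    acceptable_chars = string.ascii_letters + string.digits + "'’- "
    # Assumption: newlines are treated like '.' and '"' and become spaces.
    space_chars = '."\n'
    # Two-argument str.maketrans maps each character of the first string
    # to the character at the same position in the second string.
    table = str.maketrans(acceptable_chars + space_chars,
                          acceptable_chars.lower() + ' ' * len(space_chars))
    kept = set(acceptable_chars + space_chars)
    # Drop every other character, then translate and split on whitespace.
    filtered = ''.join(ch for ch in lines if ch in kept)
    return filtered.translate(table).split()

print(stringtolist("newline\n   problem"))  # ['newline', 'problem']
```

Calling `split()` with no argument splits on any run of whitespace, so the extra spaces produced by the translation do not create empty list entries.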

Putting a string into a list in Python
