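Using strip() in Python

Time: 2015-02-12 04:48:56

Tags: python strip

Write a function list_of_words that takes a list of strings as above and returns a list of individual words with all whitespace and punctuation removed (except apostrophes/single quotes).

My code removes the periods and spaces, but not the commas or exclamation marks.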

def list_of_words(list_str):
    m = []
    for i in list_str:
        i.strip('.')
        i.strip(',')
        i.strip('!')
        m = m+i.split()
    return m

print(list_of_words(["Four score and seven years ago, our fathers brought forth on",
  "this continent a new nation, conceived in liberty and dedicated",
  "to the proposition that all men are created equal.  Now we are",
  "   engaged in a great        civil war, testing whether that nation, or any",
  "nation so conceived and so dedicated, can long endure!"]))

5 answers:

Answer 0 (score: 2)

One of the simplest ways to clean up some punctuation and runs of multiple spaces is the re.sub function.

import re

sentence_list = ["Four score and seven years ago, our fathers brought forth on",
                 "this continent a new nation, conceived in liberty and dedicated",
                 "to the proposition that all men are created equal.  Now we are",
                 "   engaged in a great        civil war, testing whether that nation, or any",
                 "nation so conceived and so dedicated, can long endure!"]

# Remove every run of commas, periods and exclamation marks, then trim the ends.
sentences = [re.sub(r'[,.!]+', '', sentence).strip() for sentence in sentence_list]
# Collapse runs of two or more spaces into one and join the lines into a single string.
words = ' '.join(re.sub(r' {2,}', ' ', sentence) for sentence in sentences)

print(words)
# Prints: Four score and seven years ago our fathers brought forth on this continent a new nation conceived in liberty and dedicated to the proposition that all men are created equal Now we are engaged in a great civil war testing whether that nation or any nation so conceived and so dedicated can long endure

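Answer 1 (score: 1)

strip returns a new string; you need to capture that result and apply the remaining strip calls to it. So your code should be changed to: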

for i in list_str:
    i = i.strip('.')
    i = i.strip(',')
    i = i.strip('!')
    ....

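On a second note, strip only removes the given characters from the beginning and end of the string. If you want to remove characters from the middle of the string, you should consider replace.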
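
A short sketch of that difference, using a made-up sample string that is not part of the question: strip trims only the ends, while replace removes every occurrence.

```python
# Hypothetical sample string, chosen only to illustrate the behaviour.
s = "!war, testing, endure!"

# strip returns a new string with the given characters trimmed from both ends only.
stripped = s.strip("!,")
print(stripped)   # war, testing, endure

# replace returns a new string with every occurrence removed, wherever it is.
replaced = s.replace(",", "").replace("!", "")
print(replaced)   # war testing endure

# Strings are immutable, so the original is unchanged either way.
print(s)          # !war, testing, endure!
```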

Answer 2 (score: 1)

You can use a regular expression, as described in this question. Essentially:

import re

i = re.sub('[.,!]', '', i)
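
Dropped into the function from the question, this might look like the sketch below. It assumes that only periods, commas and exclamation marks need to be removed, as in the sample text; the variable names are illustrative.

```python
import re

def list_of_words(list_str):
    """Return the words in list_str with '.', ',' and '!' removed."""
    words = []
    for line in list_str:
        # Remove every '.', ',' or '!' anywhere in the line.
        cleaned = re.sub(r'[.,!]', '', line)
        # split() with no arguments splits on any run of whitespace.
        words.extend(cleaned.split())
    return words

result = list_of_words(["   engaged in a great        civil war, testing",
                        "can long endure!"])
print(result)
# ['engaged', 'in', 'a', 'great', 'civil', 'war', 'testing', 'can', 'long', 'endure']
```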

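Answer 3 (score: 0)

As already mentioned, you need to assign the result of i.strip() back to i. And as also mentioned, the replace method is a better fit here. Below is an example that uses replace: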

def list_of_words(list_str: list) -> list:
    m = []
    for i in list_str:
        i = i.replace('.', '')
        i = i.replace(',', '')
        i = i.replace('!', '')
        m.extend(i.split())
    return m

print(list_of_words([ "Four score and seven years ago, our fathers brought forth on",
  "this continent a new nation, conceived in liberty and dedicated",
  "to the proposition that all men are created equal.  Now we are",
  "   engaged in a great        civil war, testing whether that nation, or any",
  "nation so conceived and so dedicated, can long endure!"]))

Note that I also replaced m=m+i.split() with m.extend(i.split()) for readability.
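
For comparison, here is a small sketch of how the list operations involved behave with the result of split(). The sample string is made up for illustration.

```python
words = "four score".split()

appended = []
appended.append(words)      # adds the list itself as a single element
print(appended)             # [['four', 'score']]

extended = []
extended.extend(words)      # adds each word as its own element
print(extended)             # ['four', 'score']

concatenated = [] + words   # same result as extend, but builds a new list
print(concatenated)         # ['four', 'score']
```

Only extend and + give the flat list of words the task asks for; append would nest one list per input line.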

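Answer 4 (score: 0)

It is better not to rely on your own list of punctuation characters but to use Python's. As others have pointed out, a regular expression can then remove the characters: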

import re
import string

punctuations = re.sub("[`']", "", string.punctuation)  # keep ` and '
i = re.sub("[" + re.escape(punctuations) + "]", "", i)  # escape ], \, ^, -

There is also string.whitespace, although split will take care of those characters for you.
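
As a complete function, this approach might be sketched as follows. The names PUNCTUATION and PUNCTUATION_RE are hypothetical, introduced here for illustration, and the sample input is made up. The sketch passes the character set through re.escape as a precaution, so that characters such as ] and \ are not read as regex syntax.

```python
import re
import string

# Punctuation to delete: everything in string.punctuation except ` and '.
PUNCTUATION = re.sub("[`']", "", string.punctuation)
# re.escape keeps characters such as ']' and '\' from being read as regex syntax.
PUNCTUATION_RE = re.compile("[" + re.escape(PUNCTUATION) + "]")

def list_of_words(list_str):
    """Return the words in list_str with punctuation (except ` and ') removed."""
    words = []
    for line in list_str:
        # Delete the punctuation, then let split() handle all whitespace.
        words.extend(PUNCTUATION_RE.sub("", line).split())
    return words

result = list_of_words(["our fathers' nation, conceived in liberty;",
                        "  can it long   endure?!"])
print(result)
# ['our', "fathers'", 'nation', 'conceived', 'in', 'liberty', 'can', 'it', 'long', 'endure']
```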