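Write the function list_of_words that takes a list of strings as above and returns a list of individual words with all whitespace and punctuation removed (except for apostrophes/single quotes).
My code removes periods and spaces, but not commas or exclamation marks.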
def list_of_words(list_str):
    m = []
    for i in list_str:
        i.strip('.')
        i.strip(',')
        i.strip('!')
        m = m+i.split()
    return m
print(list_of_words(["Four score and seven years ago, our fathers brought forth on",
    "this continent a new nation, conceived in liberty and dedicated",
    "to the proposition that all men are created equal. Now we are",
    " engaged in a great civil war, testing whether that nation, or any",
    "nation so conceived and so dedicated, can long endure!"]))
Answer 0 (score: 2)
One of the simplest ways to remove some punctuation and collapse multiple spaces is to use the re.sub function.
import re
sentence_list = ["Four score and seven years ago, our fathers brought forth on",
    "this continent a new nation, conceived in liberty and dedicated",
    "to the proposition that all men are created equal. Now we are",
    " engaged in a great civil war, testing whether that nation, or any",
    "nation so conceived and so dedicated, can long endure!"]
# Remove every comma, period and exclamation mark.
sentences = [re.sub(r'[,.!]+', '', sentence).strip() for sentence in sentence_list]
# Collapse runs of two or more spaces into one, then join the sentences.
words = ' '.join([re.sub(r' {2,}', ' ', sentence).strip() for sentence in sentences])
print(words)
"Four score and seven years ago our fathers brought forth on this continent a new nation conceived in liberty and dedicated to the proposition that all men are created equal Now we are engaged in a great civil war testing whether that nation or any nation so conceived and so dedicated can long endure"
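Answer 1 (score: 1)
strip returns a new string; you need to capture that result and apply the remaining strip calls to it.
So your code should be changed to
for i in list_str:
    i = i.strip('.')
    i = i.strip(',')
    i = i.strip('!')
    ....
On a second note, strip only removes the given characters from the beginning and end of the string. If you want to remove characters from the middle of the string, you should consider replace.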
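To make the difference concrete, here is a small sketch comparing strip and replace on a made-up sample string (the variable names are only for illustration):

```python
s = ",ago, our fathers!"

# strip() returns a new string and only trims the given characters from both ends.
stripped = s.strip(",!")

# replace() also returns a new string, but removes every occurrence.
replaced = s.replace(",", "").replace("!", "")

print(repr(stripped))   # the comma in the middle is still there
print(repr(replaced))   # all commas and exclamation marks are gone
print(repr(s))          # the original is unchanged, because str is immutable
```

Neither method modifies s in place, which is why the result has to be assigned back.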
Answer 2 (score: 1)
You can use a regular expression, as described in this question. Essentially,
import re
i = re.sub('[.,!]', '', i)
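Wrapped into the function the question asks for, the substitution might look like the sketch below. It assumes the list_of_words name from the question and only the three punctuation marks in the pattern above; the sample input is shortened for illustration.

```python
import re

def list_of_words(list_str):
    # Collect the words from every string in the input list.
    words = []
    for i in list_str:
        # Remove periods, commas and exclamation marks anywhere in the string.
        i = re.sub('[.,!]', '', i)
        # split() with no arguments also discards leading, trailing and repeated whitespace.
        words.extend(i.split())
    return words

print(list_of_words(["seven years ago, our fathers", " a great civil war!"]))
```

Other punctuation, such as semicolons or question marks, would pass through unchanged with this pattern.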
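Answer 3 (score: 0)
As already mentioned, you need to assign the result of i.strip() back to i. Also as mentioned, the replace method is the better choice. Here is an example using replace: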
def list_of_words(list_str: list) -> list:
    m = []
    for i in list_str:
        # replace() removes every occurrence, not just those at the ends.
        i = i.replace('.', '')
        i = i.replace(',', '')
        i = i.replace('!', '')
        m.extend(i.split())
    return m
print(list_of_words(["Four score and seven years ago, our fathers brought forth on",
    "this continent a new nation, conceived in liberty and dedicated",
    "to the proposition that all men are created equal. Now we are",
    " engaged in a great civil war, testing whether that nation, or any",
    "nation so conceived and so dedicated, can long endure!"]))
As you may notice, I also replaced m=m+i.split() with m.extend(i.split()) for readability.
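For reference, a short sketch of how the three ways of growing a list behave with the result of split(); the sample values are made up for illustration:

```python
words = "our fathers".split()

a = ["ago"]
a = a + words        # concatenation builds a new list and rebinds a
b = ["ago"]
b.extend(words)      # adds each word to the existing list in place
c = ["ago"]
c.append(words)      # adds the whole list as one nested element

print(a)
print(b)
print(c)
```

Only the first two produce a flat list of words; append would leave a list inside the list.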
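Answer 4 (score: 0)
It is better not to rely on your own list of punctuation but to use Python's; as others have pointed out, use a regular expression to remove the characters: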
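import re
import string
punctuations = re.sub("[`']", "", string.punctuation)  # keep backticks and apostrophes
i = re.sub("[" + re.escape(punctuations) + "]", "", i)  # escape ], \ and ^ inside the class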
There is also string.whitespace, although split will take care of whitespace for you.
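Putting these pieces together as a complete function could look like the minimal sketch below. It assumes the list_of_words name from the question and a made-up sample input. re.escape is added as a precaution, since string.punctuation contains characters such as ], \ and ^ that have a special meaning inside a character class.

```python
import re
import string

# All ASCII punctuation except the backtick and the apostrophe.
punctuations = re.sub("[`']", "", string.punctuation)
# Escape the punctuation so that every character is matched literally.
pattern = re.compile("[" + re.escape(punctuations) + "]")

def list_of_words(list_str):
    words = []
    for i in list_str:
        # split() handles spaces, tabs and newlines, so string.whitespace is not needed here.
        words.extend(pattern.sub("", i).split())
    return words

print(list_of_words(["our fathers' (new) nation;", " don't give up!"]))
```

Apostrophes survive, so contractions and possessives stay intact, as the question requires.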