How to remove punctuation from a list

Date: 2018-12-29 08:19:11

Tags: python beautifulsoup preprocessor data-cleaning scrape

Can someone help me with this code? Right now 'test2' comes out identical to 'test'. If I test with a plain string it works fine, but with a list it does not.

import string

punc = set(string.punctuation)
test = [" course content good though textbook badly written.not got energy though seems good union.it distance course i'm sure facilities.n/an/ain last year offer poor terms academic personal. seems un become overwhelmed trying become run"]

test2 = ''.join(w for w in test if w not in punc)
print(test2)

I want to remove all punctuation.

3 answers:

Answer 0 (score: 1)

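Since test is a list, "for w in test" iterates over the items of the list, and its only item is the complete string. That whole string is not a member of punc, so it is joined back unchanged. To test every individual character, you need to iterate over the characters of "w" itself.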
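The difference can be seen in a short sketch. The one-element list below is illustrative data, not the question's input, and the variable names are chosen only for this example:

```python
import string

punc = set(string.punctuation)
test = ["Hi there, world!"]  # illustrative one-element list

# Iterating over the list yields whole strings.
items = [w for w in test]
# Iterating over the string inside the list yields single characters.
chars = [w for w in test[0]]

# The whole string is never a member of punc, so nothing is filtered out.
unchanged = ''.join(w for w in test if w not in punc)
# Each character is checked individually, so punctuation is dropped.
cleaned = ''.join(w for w in test[0] if w not in punc)

print(items)
print(chars[:3])
print(unchanged)
print(cleaned)
```

The first join reproduces the input because the filter only ever sees one item: the full string.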

Answer 1 (score: 1)

import string
test=[" course content good though textbook badly written.not got energy though seems good union.it distance course i'm sure facilities.n/an/ain last year offer poor terms academic personal. seems un become overwhelmed trying become run"]
test2 = ''.join(w for w in test[0] if w not in string.punctuation )
print(test2)

If the list contains multiple strings:

import string
test=["Hi There!"," course content good though textbook badly written.not got energy though seems good union.it distance course i'm sure facilities.n/an/ain last year offer poor terms academic personal. seems un become overwhelmed trying become run"]
# If there are multiple strings in the list
for x in test:
    print(''.join(w for w in x if w not in string.punctuation))
# If there are multiple strings in the list and you want to join all of them together,
# iterate over the characters of every string (not over the strings themselves)
print(''.join(w for x in test for w in x if w not in string.punctuation))

If you need to append the results to a list variable:

import string
test2=[]
test=["Hi There!"," course content good though textbook badly written.not got energy though seems good union.it distance course i'm sure facilities.n/an/ain last year offer poor terms academic personal. seems un become overwhelmed trying become run"]
# If there are multiple strings in the list
for x in test:
    test2.append(''.join(w for w in x if w not in string.punctuation ))
print(test2)
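The same result can probably be written more compactly as a list comprehension. This is a minimal sketch; `strip_punctuation` is a hypothetical helper name introduced only for illustration, and the input list is shortened sample data:

```python
import string

def strip_punctuation(strings):
    """Return a new list with ASCII punctuation removed from each string."""
    punc = set(string.punctuation)  # set membership checks are cheap
    return [''.join(ch for ch in s if ch not in punc) for s in strings]

test = ["Hi There!", "it's a distance course."]  # shortened sample data
test2 = strip_punctuation(test)
print(test2)
```

The input list is left untouched; a new list is returned, which mirrors the append-based loop above.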

Answer 2 (score: 0)

The fastest (and probably the most Pythonic) way is to use str.translate.

import string
test=["Hi There!"," course content good though textbook badly written.not got energy though seems good union.it distance course i'm sure facilities.n/an/ain last year offer poor terms academic personal. seems un become overwhelmed trying become run"]

# Create a translate table that translates all punctuation to nothing
transtable = {ord(a):None for a in string.punctuation}

# Apply the translation to all strings in the list
test2 = [s.translate(transtable) for s in test]
print(test2)
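In Python 3, an equivalent table can also be built with str.maketrans, whose third argument lists the characters to delete. The sketch below uses shortened sample strings rather than the full input. Note that string.punctuation only covers ASCII punctuation, so characters such as curly quotes would be left in place.

```python
import string

# The third argument of str.maketrans names the characters to delete.
table = str.maketrans('', '', string.punctuation)

test = ["Hi There!", "n/a, i'm sure."]  # shortened sample data
cleaned = [s.translate(table) for s in test]
print(cleaned)
```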