Removing punctuation

Time: 2013-03-17 02:23:07

Tags: python punctuation

I need help with a punctuation function so that I can print the words in a file without any punctuation. For the line: "How are you today?"

So far, it prints:

"how
are
you
today?"

But I want it to print:

how
are
you
today

My code looks like this:

from scanner import *
import sys
import string

def processFile(filename):
    s = Scanner(filename)
    token = s.readtoken()
    array = []
    while token != "":
        newToken = ""
        for i in range(0,len(token),1):
            newchar = RawChar(token[i])
            newToken = newToken + newchar
        array.append(newToken)
        token = s.readtoken()
    s.close()
    return array

def eachLine(tokens):
    for i in range(0,len(tokens),1):
        pun(tokens[i])
        print(tokens[i])
    return

def pun(string):
    punctuation = ["`","~","!","@","#","$","%","^","&","*","(",")","_","-","+","=","{","[","}","]","|",":",";","\"","'","<",",",">",".","?","/"]
    for i in string:
        newString = ""
        if i not in string:
            newString = newString + i
    return newString

def RawChar(char):
    if char == "A":
        char = "a"
    elif char == "B":
        char = "b"
    elif char == "C":
        char = "c"
    elif char == "D":
        char = "d"
    elif char == "E":
        char = "e"
    elif char == "F":
        char = "f"
    elif char == "G":
        char = "g"
    elif char == "H":
        char = "h"
    elif char == "I":
        char = "i"
    elif char == "J":
        char = "j"
    elif char == "K":
        char = "k"
    elif char == "L":
        char = "l"
    elif char == "M":
        char = "m"
    elif char == "N":
        char = "n"
    elif char == "O":
        char = "o"
    elif char == "P":
        char = "p"
    elif char == "Q":
        char = "q"
    elif char == "R":
        char = "r"
    elif char == "S":
        char = "s"
    elif char == "T":
        char = "t"
    elif char == "U":
        char = "u"
    elif char == "V":
        char = "v"
    elif char == "W":
        char = "w"
    elif char == "X":
        char = "x"
    elif char == "Y":
        char = "y"
    elif char == "Z":
        char = "z"
    return char

def main():
    newForm = processFile(sys.argv[1])
    eachLine(newForm)

main()

Any suggestions on where to put def pun(string)?

3 answers:

Answer 0 (score: 7)

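To remove punctuation from a string, use str.translate (shown here in its Python 2 form, where the second argument lists the characters to delete):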

In [124]: import string

In [126]: string.punctuation
Out[126]: '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [127]: '"How are you today?"'.translate(None, string.punctuation)
Out[127]: 'How are you today'
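
On Python 3, str.translate accepts only a table, so the call above would need a table built with str.maketrans. A minimal sketch of that variant; the names `_DELETE_PUNCT` and `strip_punctuation` are hypothetical and not part of the original answer:

```python
import string

# Table that maps every ASCII punctuation character to None, i.e. deletes it.
# The third argument of str.maketrans lists the characters to remove.
_DELETE_PUNCT = str.maketrans("", "", string.punctuation)

def strip_punctuation(text):
    """Return text with every character in string.punctuation removed."""
    return text.translate(_DELETE_PUNCT)

print(strip_punctuation('"How are you today?"'))  # How are you today
```

Building the table once and reusing it avoids recomputing it for every token read from the file.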

Answer 1 (score: 1)

You can significantly improve the punctuation stripping with the technique shown in this stackoverflow article. Then use s.lower() to lowercase the string s.
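
The linked article is not reproduced here, so the following is only one plausible reading of the suggestion: strip punctuation from each token with str.strip, then lowercase it with str.lower, which also replaces the long RawChar chain in the question. The function name `clean_tokens` is hypothetical.

```python
import string

def clean_tokens(line):
    """Split line on whitespace, strip punctuation from both ends of each
    token, lowercase it, and skip tokens that end up empty."""
    cleaned = []
    for token in line.split():
        word = token.strip(string.punctuation).lower()
        if word:
            cleaned.append(word)
    return cleaned

for word in clean_tokens('"How are you today?"'):
    print(word)
```

Because str.strip only touches the ends of a token, punctuation inside a word (the apostrophe in "don't", for instance) is kept. If that is not wanted, a deletion approach such as str.translate removes it everywhere.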

Answer 2 (score: 0)

import string
s = '"Right now!" she shouted, and hands fluttered in the air - amid a few cheers - for about two minutes.'
x = "".join([c for c in s if c not in string.punctuation])
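
Applied to the question's code, the same membership test could replace the body of pun. This is a sketch under two assumptions: the caller prints the value pun returns (strings are immutable, so pun cannot change its argument in place), and the parameter is renamed so it no longer shadows the string module. The name `each_line` is a hypothetical stand-in for the question's eachLine.

```python
import string

def pun(token):
    """Return token with every character in string.punctuation dropped."""
    # The parameter is called token, not string, so the module stays reachable.
    return "".join(c for c in token if c not in string.punctuation)

def each_line(tokens):
    """Print each token with punctuation removed and letters lowercased."""
    for token in tokens:
        # Use the returned value; the original token is left unchanged.
        print(pun(token).lower())

each_line(['"How', 'are', 'you', 'today?"'])
```

With this arrangement pun can be defined anywhere at module level, as long as the definition runs before main() is called.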