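How do I remove punctuation from a string in Python?

Asked: 2016-02-08 10:49:22

Tags: python

I'm trying to remove all punctuation from a string, but whenever I run the program nothing happens... Here is my code: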

#OPEN file (a christmas carol)
inputFile = open('H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt')
carolText = inputFile.read()



#CONVERT everything into lowercase
for line in carolText:
       carolTextlower = carolText.lower()

#REMOVE punctuation (Put a space instead of a hyphened word or apostrophe)
import string
exclude = set(string.punctuation)
noPunctu = carolTextlower.join(ch for ch in carolTextlower if ch not in exclude)
print(noPunctu)

When I run the program, nothing is printed.

4 answers:

Answer 0: (score: 0)

Try the following code:

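import string

# Raw string so the backslashes in the Windows path are taken literally.
with open(r'H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt') as inputFile:
    carolText = inputFile.read()

# Remove each punctuation character in turn.
for c in string.punctuation:
    carolText = carolText.replace(c, "")

print(carolText)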

Answer 1: (score: 0)

Here is how to open a file, replace a given character in it, and write everything out again to a new file.

to_replace = '-'  # Hyphen
replace_by = ' '  # Space

# Reading the file to be modified.
with open('file.txt', 'r') as file:
    # Modifying the contents as the file is being read.
    new_file = [line.replace(to_replace, replace_by) for line in file]

# Writing the contents, both modified and untouched lines, to a new file.
with open('file_modified.txt', 'w') as file:
    for item in new_file:
        # Each line already ends with '\n', so don't add another.
        print(item, file=file, end='')

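Answer 2: (score: 0)

This can be done with Python's str.translate method. The code builds a table that maps every uppercase character to its lowercase counterpart and every punctuation character to a space. The whole text is processed in a single call, so it is fast: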

import string

def process_text(s):
    return s.translate(
        str.maketrans(
            string.punctuation + string.ascii_uppercase, 
            " " * len(string.punctuation) + string.ascii_lowercase)).replace("  ", " ")

with open(r'H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt') as inputFile:
    print(process_text(inputFile.read()))
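
If you would rather drop punctuation outright than replace it with spaces, str.maketrans also accepts a third argument listing characters to delete. A minimal sketch, assuming the ASCII characters in string.punctuation are all you need to remove; the name strip_punctuation and the sample sentence are made up for illustration and are not part of the answer above:

```python
import string

# Translation table whose third argument lists the characters to delete.
_DELETE_PUNCTUATION = str.maketrans("", "", string.punctuation)

def strip_punctuation(text):
    """Lowercase text and delete every ASCII punctuation character."""
    return text.lower().translate(_DELETE_PUNCTUATION)

sample = "Marley was dead: to begin with. There is no doubt whatever about that!"
print(strip_punctuation(sample))
```

Also note that .replace("  ", " ") in the answer above makes a single pass, so a run of three or more spaces is shortened but not fully collapsed. " ".join(text.split()) collapses all whitespace, at the cost of flattening line breaks as well.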

Answer 3: (score: 0)

Here is a fixed version of your code.

import string

#OPEN file (a christmas carol)
inputFile = open(r'H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt')
carolText = inputFile.read()
inputFile.close()

#CONVERT everything into lowercase
carolTextlower = carolText.lower()

#REMOVE punctuation 
exclude = set(string.punctuation)
noPunctu = ''.join(ch for ch in carolTextlower if ch not in exclude)
print(noPunctu)
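
The key change is the separator used with join: the string that join is called on is inserted between every pair of items. In the question's code that separator was the entire lowercased text, so the result grows roughly with the square of the file length. On a full book that could plausibly exhaust memory or appear to hang, though that explanation is my inference rather than something the asker confirmed. A small sketch with a made-up sample string:

```python
import string

exclude = set(string.punctuation)
text = "bah, humbug!"

# Characters that survive the punctuation filter.
kept = [ch for ch in text if ch not in exclude]

# Empty separator: the kept characters are simply concatenated.
good = "".join(kept)

# Text as separator: a full copy of the text lands between every pair of kept characters.
bad = text.join(kept)

print(good)
print(len(good), len(bad))
```

Even for this 12-character sample the second result is more than ten times longer than the first.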

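The usual Python convention is to put import statements at the top of the script, where they are easy to find.

Note that I used a raw string for the file name (indicated by the r before the opening quote). It isn't strictly necessary here, but it prevents backslash sequences in Windows paths from being interpreted as escape sequences. For example, in 'H:\Documents\new\test.py', \n would be interpreted as a newline and \t as a tab.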
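
To see the difference, compare the same literal with and without the r prefix. The path below is hypothetical and was chosen only because it contains \n and \t:

```python
# Hypothetical path, picked so that it contains the sequences \n and \t.
plain = 'C:\new\test.py'   # \n and \t are turned into newline and tab
raw = r'C:\new\test.py'    # backslashes are kept as written

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

The plain literal comes out two characters shorter, because each two-character escape sequence collapses into a single control character.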

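You really should close a file once you have finished reading (or writing) it. However, it is better to open the file with the with keyword: this ensures the file is closed properly even if an error occurs. For example,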

filename = r'H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt'
with open(filename) as inputFile:
    carolText = inputFile.read()
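
If you want to convince yourself that the file really is closed when something goes wrong, here is a small sketch. It uses a throwaway temporary file as a stand-in for christmascarol.txt and raises a simulated error inside the with block; the variable names are just for illustration:

```python
import os
import tempfile

# Create a throwaway file to open (a stand-in for christmascarol.txt).
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

handle = None
try:
    with open(path) as input_file:
        handle = input_file
        raise ValueError("simulated failure while reading")
except ValueError:
    pass

# The with block closed the file even though an exception was raised.
print(handle.closed)

os.remove(path)
```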