Question

I am trying to remove all punctuation from a string, but whenever I run my program nothing happens... Here is my code:

#OPEN file (a christmas carol)
inputFile = open('H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt')
carolText = inputFile.read()



#CONVERT everything into lowercase
for line in carolText:
       carolTextlower = carolText.lower()

#REMOVE punctuation (Put a space instead of a hyphened word or apostrophe)
import string
exclude = set(string.punctuation)
noPunctu = carolTextlower.join(ch for ch in carolTextlower if ch not in exclude)
print(noPunctu)

When I run the program, nothing appears.

Answer 1

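Try the following code:

import string

# A raw string keeps the backslashes in the Windows path literal.
with open(r'H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt') as inputFile:
    carolText = inputFile.read()

# Remove each punctuation character in turn.
for c in string.punctuation:
    carolText = carolText.replace(c, "")

# In a script, a bare expression displays nothing, so print the result.
print(carolText)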

Answer 2

Here is how to open a file, replace a given character in it, and write everything out to a new file.

to_replace = '-'  # Hyphen
replace_by = ' '  # Space

# Reading the file to be modified.
with open('file.txt', 'r') as file:
    # Modifying the contents as the file is being read.
    new_file = [line.replace(to_replace, replace_by) for line in file]

# Writing the contents, both modified and untouched ones, to a new file.
with open('file_modified.txt', 'w') as file:
    for item in new_file:
        # Each line already ends with its own newline, so don't add another.
        print(item, file=file, end='')

Answer 3

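This can be done with Python's str.translate method. The code builds a table that maps every ASCII uppercase letter to its lowercase counterpart and every punctuation character to a space. The whole text is converted in a single call, so it is very fast: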

import string

def process_text(s):
    return s.translate(
        str.maketrans(
            string.punctuation + string.ascii_uppercase, 
            " " * len(string.punctuation) + string.ascii_lowercase)).replace("  ", " ")

with open(r'H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt') as inputFile:
    print(process_text(inputFile.read()))
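If you would rather delete the punctuation outright than turn it into spaces, a minimal sketch along the same lines is shown here. It relies on the three-argument form of str.maketrans, whose third argument lists characters to remove. The function name strip_punctuation and the sample sentence are just illustrations, not part of the original code.

```python
import string

# The third argument of str.maketrans lists characters to delete outright.
delete_punct = str.maketrans("", "", string.punctuation)

def strip_punctuation(s):
    """Lowercase s and drop every ASCII punctuation character."""
    return s.lower().translate(delete_punct)

# Illustrative sample text (an assumption, not read from the real file).
sample = "Marley was dead: to begin with."
print(strip_punctuation(sample))  # marley was dead to begin with
```

Note that string.punctuation only covers ASCII punctuation, so typographic characters such as curly quotes would pass through untouched.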

Answer 4

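Here is a fixed version of your code. The main bug was calling join on carolTextlower itself, which inserts the entire text between every pair of kept characters; joining on the empty string '' avoids that. The for loop around lower() was also unnecessary, since lower() already converts the whole string in one call.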

import string

#OPEN file (a christmas carol)
inputFile = open(r'H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt')
carolText = inputFile.read()
inputFile.close()

#CONVERT everything into lowercase
carolTextlower = carolText.lower()

#REMOVE punctuation 
exclude = set(string.punctuation)
noPunctu = ''.join(ch for ch in carolTextlower if ch not in exclude)
print(noPunctu)

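The usual Python convention is to put import statements at the top of the script, where they are easy to find.

Note that I used a raw string for the file name (marked by an r before the opening quote). It isn't strictly necessary here, but it prevents backslash sequences in Windows paths from being interpreted as escape sequences. For example, in 'H:\Documents\new\test.py', \n would be interpreted as a newline and \t as a tab.

You really should close a file once you have finished reading from (or writing to) it. It is better still to open the file with the with keyword, which ensures the file is closed properly even if an error occurs. For example,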
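
To see the difference for yourself, here is a small sketch comparing a plain string with a raw string. The path is hypothetical and was chosen only because it contains \n and \t.

```python
# Hypothetical path, chosen because it contains \n and \t.
plain = 'C:\new\test.py'   # \n becomes a newline, \t becomes a tab
raw = r'C:\new\test.py'    # backslashes are kept literally

print(len(plain))  # 12
print(len(raw))    # 14
print(repr(plain))
print(raw == 'C:\\new\\test.py')  # True
```

The plain version is two characters shorter because each backslash sequence collapsed into a single control character.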

filename = r'H:\Documents\Computing\GCSE COMPUTING\Revision\Practice Prog/christmascarol.txt'
with open(filename) as inputFile:
    carolText = inputFile.read()
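
As a rough illustration of the "even if an error occurs" part, the sketch below raises an exception inside a with block and then checks that the file was still closed. It writes a throwaway temporary file so it can run anywhere; the file contents and the simulated error are assumptions made for the demonstration.

```python
import os
import tempfile

# Create a throwaway file so the example runs anywhere (made-up content).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('Marley was dead: to begin with.')
    path = tmp.name

try:
    with open(path) as inputFile:
        text = inputFile.read()
        # Simulate something going wrong while the file is still open.
        raise ValueError('simulated failure while processing')
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception.
closed_after_error = inputFile.closed
print(closed_after_error)  # True

os.remove(path)  # clean up the temporary file
```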

How do I remove punctuation from a string in Python?