Question

I am trying to write a profanity check. The code I have written so far is

import urllib.request  

def read_text ():
    file = open (r"C:\Users\Kashif\Downloads\abc.txt")
    file_print = file.read ()
    print (file_print)
    file.close ()
    check_profanity (file_print)

def check_profanity (file_print):
    connection = urllib.request.urlopen ("http://www.purgomalum.com/service/containsprofanity?text="+file_print)
    output = connection.read ()
    print ("The Output is "+output)
    connection.close ()
    read_text ()

But I get the following error

urllib.error.HTTPError: HTTP Error 400: Bad Request

I don't know what I'm doing wrong.

I'm using Python 3.6.1

Answer 1

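The HTTP error you are getting usually means that something is wrong with the way you request data from the server. According to the HTTP spec:

400 Bad Request

The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.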
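To see exactly which status code and reason the server sent back, the exception can be caught and inspected. This is only a minimal sketch: `describe_http_error` and `fetch` are hypothetical helper names, not part of the original script.

```python
import urllib.error
import urllib.request


def describe_http_error(err):
    # HTTPError exposes the numeric status as .code and the reason phrase as .reason
    return "HTTP %d: %s" % (err.code, err.reason)


def fetch(url):
    # Hypothetical helper: returns the response body as bytes,
    # or None when the server answers with an error status.
    try:
        with urllib.request.urlopen(url) as connection:
            return connection.read()
    except urllib.error.HTTPError as err:
        print(describe_http_error(err))
        return None
```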

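In your example specifically, the problem seems to be that the data you send in the URL is not URL-encoded. Try the quote_plus function from the urllib.parse module to make your request acceptable: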

from urllib.parse import quote_plus

...

encoded_file_print = quote_plus(file_print)
url = "http://www.purgomalum.com/service/containsprofanity?text=" + encoded_file_print
connection = urllib.request.urlopen(url)

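If this doesn't work, the problem may lie in the contents of the file. Try a simple example first to verify that your script works, then try it with the contents of the file.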
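One way to run that simple check is to build the URL from a short literal string and look at it before sending anything. The sketch below makes no network call; `BASE_URL`, `build_url`, and the sample text are illustrative names and values, not part of the original script.

```python
from urllib.parse import quote_plus

BASE_URL = "http://www.purgomalum.com/service/containsprofanity?text="


def build_url(text):
    # quote_plus percent-encodes unsafe characters and turns spaces into "+"
    return BASE_URL + quote_plus(text)


# A short, known-clean sample that still contains a space and a newline
sample = "hello world\nsecond line"
print(build_url(sample))
# The query part becomes: text=hello+world%0Asecond+line
```

If the encoded URL works with the sample but not with the file, the file contents (for example their length or unusual characters) are the next thing to look at.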

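Besides the above, your code has a few other issues:

There is no need for a space between a function name and its parentheses: file.close () or def read_text ():, and so on.
Decode the content into a string after reading it: output = connection.read().decode('utf-8')
The way you call the functions creates a circular dependency. read_text calls check_profanity, which in turn ends by calling read_text, which calls check_profanity, and so on. Remove the extra call and use return to hand each function's output back to the caller: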
```
content = read_text()
has_profanity = check_profanity(content)
print("has profanity? %s" % has_profanity)
```
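Putting the points above together, the two functions might look roughly like this. It is a sketch, not a tested client: `SERVICE_URL`, `parse_response`, and the `path` parameter are names introduced here, and the code assumes the service replies with the plain text `true` or `false`.

```python
import urllib.request
from urllib.parse import quote_plus

SERVICE_URL = "http://www.purgomalum.com/service/containsprofanity?text="


def read_text(path):
    # The with-block closes the file even if read() raises
    with open(path) as file:
        return file.read()


def parse_response(raw):
    # urlopen(...).read() returns bytes, so decode before comparing with a str.
    # Assumption: the body is the plain text "true" or "false".
    return raw.decode("utf-8").strip().lower() == "true"


def check_profanity(text):
    # URL-encode the text, send the request, and return a bool to the caller
    url = SERVICE_URL + quote_plus(text)
    with urllib.request.urlopen(url) as connection:
        return parse_response(connection.read())
```

Neither function calls the other, so the loop is gone; the caller passes the result of `read_text(...)` to `check_profanity(...)` as in the three lines above.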

HTTP error when using urllib.request
