ValueError: I/O operation on closed file with urllib

Time: 2018-06-17 09:32:59

Tags: python urllib

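I'm running into a problem when scraping data from the Seeking Alpha website. I know this question has been asked many times already, but the solutions offered there didn't help.

I have the following code block: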

import urllib.request

from bs4 import BeautifulSoup


class AppURLopener(urllib.request.FancyURLopener):
    version = "Mozilla/5.0"


def scrape_news(url, source):
    opener = AppURLopener()
    if(source=='SeekingAlpha'):
        print(url)
        with opener.open(url) as response:
            s = response.read()
            data = BeautifulSoup(s, "lxml")
            print(data)

scrape_news('https://seekingalpha.com/news/3364386-apple-confirms-hiring-waymo-senior-engineer','SeekingAlpha')

Any idea what might be going wrong here?

Edit: the full traceback:

Traceback (most recent call last):
  File ".\news.py", line 107, in <module>
    scrape_news('https://seekingalpha.com/news/3364386-apple-confirms-hiring-waymo-senior-engineer','SeekingAlpha')
  File ".\news.py", line 83, in scrape_news
    with opener.open(url) as response:
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python36\lib\urllib\response.py", line 30, in __enter__
    raise ValueError("I/O operation on closed file")
ValueError: I/O operation on closed file

1 answer:

Answer 0 (score: 2)

Your URL returns a 403. To confirm, try this in a terminal:

curl -s -o /dev/null -w "%{http_code}" https://seekingalpha.com/news/3364386-apple-confirms-hiring-waymo-senior-engineer

Alternatively, try this in a Python REPL:

import urllib.request

url = 'https://seekingalpha.com/news/3364386-apple-confirms-hiring-waymo-senior-engineer'
opener = urllib.request.FancyURLopener()
response = opener.open(url)

print(response.getcode())

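FancyURLopener swallows any error about the failed response code, which is why your code goes on to use the response instead of stopping, even though it never received a valid one. The standard urllib.request.urlopen should handle this for you by raising an exception on a 403 error; otherwise, you can check the status code and handle it yourself.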
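
As a minimal sketch of the urlopen approach described above: the function name fetch_html and its user_agent and opener parameters are hypothetical, introduced only for illustration (opener exists so the function can be exercised without network access). It sends a browser-like User-Agent, as AppURLopener does in the question, and catches the HTTPError that urlopen raises for a 403. Note that the site may still refuse the request even with this header.

```python
import urllib.error
import urllib.request


def fetch_html(url, user_agent="Mozilla/5.0", opener=urllib.request.urlopen):
    """Return the response body as bytes, or None on an HTTP error status.

    fetch_html, user_agent and opener are hypothetical names for this sketch.
    """
    # Attach a browser-like User-Agent header to the request.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        # urlopen raises HTTPError for 4xx/5xx responses instead of
        # handing back a response object that cannot be read.
        with opener(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # err.code is the HTTP status, e.g. 403.
        print(f"Request failed: HTTP {err.code} {err.reason}")
        return None
```

The returned bytes can then be passed to BeautifulSoup(html, "lxml") as in the question, after checking that the result is not None.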