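My background is in biology, not computer science, but I am learning Python for data science so that I can scrape Google Scholar. I wrote a program that worked at first, but at some point it stopped working, seemingly at random, and now raises a ValueError. I suspect this has to do with Google being strict about bots scraping its site. Any suggestions or fixes would be appreciated! I am using Jupyter Notebook (IPython) with Python 3.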
Code:
import pip

def install(package):
    pip.main(['install', package])

install('BeautifulSoup4')

from bs4 import BeautifulSoup
import urllib.request
from urllib.request import FancyURLopener

class AppURLopener(urllib.request.FancyURLopener):
    version = "Mozilla/5.0"

def page_citations(x):
    #number of pages of google searches that you want to run
    query = input()
    query = str(query)
    opener = AppURLopener()
    m = 0
    q = 0
    l = make_array()
    while m < x:
        response = opener.open('https://scholar.google.com/scholar?start='+str(q)+'&q=' + query + '&hl=en&as_sdt=0,5').read()
        soup = BeautifulSoup(response, 'html.parser')
        for word in str(soup.find_all(class_ = "gs_fl")).split():
            if word.endswith(''+ '</a>'):
                l = np.append(l, word.strip('</a>'))
        q = q + 10
        m = m + 1
    n = make_array()
    for number in l:
        try:
            number = int(number)
            n = np.append(n, number)
        except: continue
    return n
Error: ValueError: read of closed file
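To help narrow down where the failure happens, here is a minimal sketch that splits the program into three steps: building the URL, fetching the page, and pulling out the citation counts. It is not a confirmed fix for the error above. The helper names (`build_scholar_url`, `fetch_page`, `extract_citation_counts`), the `delay_seconds` pause, and the "Cited by N" text pattern are assumptions made for illustration. It uses only the standard library, and it reports the HTTP status code when a request is refused, which may show whether the requests are being blocked.

```python
import re
import time
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://scholar.google.com/scholar"


def build_scholar_url(query, start):
    """Build a results-page URL; urlencode escapes spaces and punctuation in the query."""
    params = {"start": start, "q": query, "hl": "en", "as_sdt": "0,5"}
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_page(url, user_agent="Mozilla/5.0"):
    """Return the page HTML as text, or None if the request fails."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The status code (for example 429, Too Many Requests) hints at why
        # the server refused the request.
        print("HTTP error", err.code, "for", url)
        return None
    except urllib.error.URLError as err:
        # Network-level failure: no HTTP response was received at all.
        print("Request failed:", err.reason)
        return None


def extract_citation_counts(html):
    """Collect the numbers following 'Cited by' (assumed link text) as integers."""
    return [int(n) for n in re.findall(r"Cited by (\d+)", html)]


def page_citations(query, pages, delay_seconds=5.0):
    """Gather citation counts from the first `pages` result pages."""
    counts = []
    for page in range(pages):
        html = fetch_page(build_scholar_url(query, page * 10))
        if html is None:
            break  # stop at the first failed request instead of retrying
        counts.extend(extract_citation_counts(html))
        time.sleep(delay_seconds)  # pause between requests
    return counts
```

With this layout, `page_citations("gene editing", 2)` would return a plain list of integers, or a shorter list if a request fails partway through. Because `extract_citation_counts` takes a string, it can also be tried on a saved copy of a results page without making any requests.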