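I'm learning web scraping and want to crawl all the journals in IEEE Xplore, but the search has to be driven by an input keyword.
Eventually I found the "Advanced Search" page at this URL: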
https://ieeexplore.ieee.org/search/advsearch.jsp?expression-builder
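Entering the following command there returns journal papers:
"Publication Title":journal
I then wrote some code to scrape the results. The response shows <200>, but the body is empty.
The request URL is: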
request_url = 'https://ieeexplore.ieee.org/rest/search'
The headers are:
headers = {
    'Content-Type': 'application/json',  # without this header the response is 403
    # I also changed the User-Agent, as the online tutorials suggest,
    # but the response is still empty
}
The payload is the same as the one shown under "Request Payload":
payload = {
    "action": "search",
    "matchBoolean": True,
    "searchField": "Search_All",
    "queryText": "(\"Publication Title\":journal)",
    "newsearch": True,
    "highlight": True,
    "returnFacets": ["ALL"],
    "returnType": "SEARCH"
}
Finally, we send the request:
import requests

res = requests.post(url=request_url, data=payload, headers=headers)  # response is <200>
test = res.text  # the content is empty: test == ''
Is there any way to solve this problem?
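
One thing I have not ruled out yet is a mismatch between the header and the body. As far as I understand requests, passing a dict through `data=` sends a form-encoded body, while `json=` serializes the dict to JSON. Below is a minimal sketch that compares the two bodies without sending anything. The `Referer` header and the helper names `prepare`, `body_as_text` and `fetch_search_results` are my own assumptions; I don't know which headers the site actually checks.

```python
import json

import requests

REQUEST_URL = "https://ieeexplore.ieee.org/rest/search"

HEADERS = {
    "Content-Type": "application/json",
    # Assumption: the server may also want a Referer; this reuses the search page URL.
    "Referer": "https://ieeexplore.ieee.org/search/advsearch.jsp?expression-builder",
}

PAYLOAD = {
    "action": "search",
    "matchBoolean": True,
    "searchField": "Search_All",
    "queryText": '("Publication Title":journal)',
    "newsearch": True,
    "highlight": True,
    "returnFacets": ["ALL"],
    "returnType": "SEARCH",
}


def prepare(**body_kwargs):
    """Build the POST request without sending it, so the body can be inspected."""
    request = requests.Request("POST", REQUEST_URL, headers=HEADERS, **body_kwargs)
    return request.prepare()


def body_as_text(prepared):
    """Return the prepared body as str (it may be bytes or str depending on the path)."""
    body = prepared.body
    return body.decode("utf-8") if isinstance(body, bytes) else body


# data= form-encodes the dict, even though the header still claims JSON.
form_request = prepare(data=PAYLOAD)
form_body = body_as_text(form_request)

# json= serializes the dict, so the body matches the declared Content-Type.
json_request = prepare(json=PAYLOAD)
json_body = body_as_text(json_request)
decoded = json.loads(json_body)

print("form body starts with:", form_body[:40])
print("json body starts with:", json_body[:40])


def fetch_search_results():
    """Send the JSON variant. Needs network access; the response format is not assumed."""
    return requests.post(REQUEST_URL, json=PAYLOAD, headers=HEADERS, timeout=30)
```

If the form-encoded body turns out to be the culprit, calling `fetch_search_results()` and checking `res.text` again would be the next thing to try. I have not confirmed that this alone makes the server return results.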