Question

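I am using the arXiv API from Python to search for academic papers. For a single-term query the arXiv API works fine, but for a multi-term query (a key phrase) the API uses only the first term.

For example: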

 import urllib.request as ur
 from bs4 import BeautifulSoup

 url = 'http://export.arxiv.org/api/query?search_query=all:electron'
 s = ur.urlopen(url)
 sl = s.read()
 soup = BeautifulSoup(sl, 'html.parser')
 papers=[soup.find_all('title')]
 print(soup)

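Output (printing the soup variable):

Here I used the query term electron, and the arXiv API searched for the term electron as well (highlighted).

But when I use the query term quantum complexity of a black hole, the arXiv API takes only the first word (quantum).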

import urllib.request as ur
from bs4 import BeautifulSoup

url = 'http://export.arxiv.org/api/query?search_query=all:quantum complexity of a black hole'
#url='http://export.arxiv.org/api/query?search_query=ti:"quantum complexity of a black hole"&sortBy=lastUpdatedDate&sortOrder=ascending'
s = ur.urlopen(url)
sl = s.read()
soup = BeautifulSoup(sl, 'html.parser')
print(soup)

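Output:

How can I search with the whole key phrase (quantum complexity of a black hole) so that the API returns academic papers containing these keywords?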

Answer 1

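You have to URL-encode the query parameter. The raw URL contains spaces, so everything after the first word is not passed through as part of search_query.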

import urllib.parse
import urllib.request as ur
from bs4 import BeautifulSoup
query = urllib.parse.quote("all:quantum complexity of a black hole")
url = 'http://export.arxiv.org/api/query?search_query=' + query
s = ur.urlopen(url)
sl = s.read()
soup = BeautifulSoup(sl, 'html.parser')
print(soup)
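
As a variation on the same idea, here is a minimal sketch that builds the encoded URL with urllib.parse.urlencode instead of concatenating strings. The helper name build_arxiv_url and its parameters are hypothetical, chosen only for illustration. Wrapping the phrase in double quotes follows the ti:"..." form in the commented-out URL from the question and is assumed to make the API treat the words as one phrase. The sketch only constructs the URL; it does not contact the server.

```python
import urllib.parse

BASE_URL = "http://export.arxiv.org/api/query"


def build_arxiv_url(phrase, field="all", max_results=10):
    """Return an arXiv API query URL for a multi-word phrase.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # Double quotes ask for the words as a single phrase (assumed behaviour,
    # mirroring the ti:"..." form used in the question).
    search_query = f'{field}:"{phrase}"'
    params = {"search_query": search_query, "max_results": max_results}
    # urlencode percent-encodes ':' and '"' and turns spaces into '+'.
    return BASE_URL + "?" + urllib.parse.urlencode(params)


url = build_arxiv_url("quantum complexity of a black hole")
print(url)
```

The resulting URL can be passed to ur.urlopen exactly as in the answer above. Dropping the double quotes from search_query gives a plain multi-word query rather than a phrase search.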

Arxiv API does not use the whole query term
