Right now my script only gets 10 results, and I'd like to increase that to 50.
Is there a way to do this with the requests library? Sorry for dumping all this code in; when I first wrote it I wasn't planning to collaborate on it, so I left out comments and the like.
(I'm not sure what else to write here, but the site says my post is mostly code and is asking for more detail. It really is a very simple question, and I don't have much to add.)
Does anyone know how to set a parameter to get 50 results per page instead of the default 10?
Here's the code I'm currently using:
import itertools
import sys
import threading
import time
from subprocess import call

import requests
from bs4 import BeautifulSoup

# Check connection
def connected_to_internet(url='http://www.google.com/', timeout=5):
    try:
        _ = requests.get(url, timeout=timeout)
    except requests.ConnectionError:
        print("No internet connection. Please connect to the internet and try again.")
        exit()

connected_to_internet()
def clearscreen():
    _ = call('clear')
# desktop user-agent
USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
# mobile user-agent
MOBILE_USER_AGENT = "Mozilla/5.0 (Linux; Android 7.0; SM-G930V Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36"
# Query user
print("What is the article about?")
query = input()
done = False
def animate():
    for c in itertools.cycle(['|', '/', '-', '\\']):
        if done:
            break
        sys.stdout.write('\r' + c)
        sys.stdout.flush()
        time.sleep(0.1)
    sys.stdout.write('\r')

t = threading.Thread(target=animate)
t.start()
print("")
time.sleep(1)
done = True  # stop the spinner thread before clearing the screen
t.join()
clearscreen()
print("Searching Google for information about:", query)
squery = query.replace(' ', '+')
URL = f"https://google.com/search?q={squery}"
headers = {"user-agent": USER_AGENT}
resp = requests.get(URL, headers=headers)
# First-stage scrape of Google
if resp.status_code == 200:
    soup = BeautifulSoup(resp.content, "html.parser")
    results = []
    # Grab all the URLs from the first page of SERPs
    for g in soup.find_all('div', class_='r'):
        anchors = g.find_all('a')
        if anchors:
            link = anchors[0]['href']
            title = g.find('h3').text
            results.append(link)  # store the URL string directly, not a set
    # results is already a flat list of URL strings, ready for the second scrape,
    # so no string manipulation is needed
    qresults = results
    # Identify the number of URLs for later comparison
    numresults = len(qresults)
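For what it's worth, the only thing I've thought of trying is tacking a `num` parameter onto the search URL. I've seen it mentioned that Google's search endpoint has historically accepted this, but I don't know whether it's still honored, so treat this as a guess rather than something I know works:

```python
# Guess: Google's search URL has historically accepted a num parameter
# to control results per page. I don't know if it is still honored.
squery = "example+topic"  # placeholder query, already '+'-escaped
num_results = 50          # hoped-for results per page (default is 10)
URL = f"https://google.com/search?q={squery}&num={num_results}"
```

If that parameter is ignored, I assume the fallback would be paging through results with `start=10`, `start=20`, and so on, but I'd rather get all 50 in one request if possible.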