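Fetching data with urllib2 or requests in Python

Time: 2016-02-15 09:23:28

Tags: python url request urllib2

I need to fetch search results from Python.

Suppose the following is my target URL, where I am searching for the keyword heart.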

http://journals.aps.org/search/results?clauses=[{%22operator%22:%22AND%22,%22field%22:%22all%22,%22value%22:%22heart%22}]&sort=relevance

The search results do not show up with either urllib2 or requests.

I tried the following code:

requests

>>> import requests
>>> url = "http://journals.aps.org/search/results"
>>> payload = {"clauses": [{"operator":"AND","field":"all","value":"heart"}], "sort": "relevance"}
>>> r = requests.get(url, params=payload)
>>> b = r.text

urllib2

>>> import urllib2
>>> search_url ="http://journals.aps.org/search/results?clauses=%5B%7B%22operator%22%3A%22AND%22%2C%22field%22%3A%22all%22%2C%22value%22%3A%22heart%22%7D%5D&sort=relevance"
>>> 
>>> req = urllib2.Request(search_url, headers={'User-Agent' : "Mozilla/5.0"})
>>> f1 = urllib2.urlopen(req)

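I cannot get the correct results with the scripts above.

1 answer:

Answer 0 (score: 0)

It is not clear from your question where you are stuck; both examples work for me. What is the "correct result" you expect?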

>>> import requests
>>> url = "http://journals.aps.org/search/results"
>>> payload = {"clauses": [{"operator":"AND","field":"all","value":"heart"}], "sort": "relevance"}
>>> r = requests.get(url, params=payload)
>>> print r.text[:40]
<!DOCTYPE html><!--[if IE 8]><html class
>>>
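
One thing that may be worth checking in the requests version: as far as I can tell, requests hands each element of a list-valued parameter to `urlencode(..., doseq=True)`, so a dict inside that list is reduced to its keys and the keyword itself never reaches the server. The sketch below reproduces that flattening with the standard library only. The variable names are mine, and the try/except import is there only so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlencode, parse_qs  # Python 3
except ImportError:
    from urllib import urlencode                  # Python 2
    from urlparse import parse_qs

# The dict that sits inside the "clauses" list in the question's payload.
clause = {"operator": "AND", "field": "all", "value": "heart"}

# With doseq=True a non-string value is iterated, and iterating a dict
# yields its keys, so the values ("AND", "all", "heart") are dropped.
flattened = urlencode([("clauses", clause), ("sort", "relevance")], doseq=True)
print(flattened)

# Parse the query string back to see what a server would receive.
received = parse_qs(flattened)
print(sorted(received["clauses"]))
```

If the printed query string contains only `operator`, `field` and `value` under `clauses`, the page that comes back is still valid HTML, but it is not a search for heart.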

>>> import urllib2
>>> search_url = "http://journals.aps.org/search/results?clauses=%5B%7B%22operator%22%3A%22AND%22%2C%22field%22%3A%22all%22%2C%22value%22%3A%22heart%22%7D%5D&sort=relevance"
>>> req = urllib2.Request(search_url, headers={'User-Agent' : "Mozilla/5.0"})
>>> f1 = urllib2.urlopen(req)
>>> print f1.read()[:40]
<!DOCTYPE html><!--[if IE 8]><html class
>>>
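
If the aim is to send the same query string as the hand-written urllib2 URL without typing the percent-escapes by hand, one option is to serialize the clauses to JSON yourself and let `urlencode` do the escaping. This is only a sketch under that assumption: `build_search_url` is a hypothetical helper of mine, and whether the site still accepts this query format is not something the snippet checks, since it makes no network request.

```python
import json

try:
    from urllib.parse import urlencode, parse_qs, urlsplit  # Python 3
except ImportError:
    from urllib import urlencode                            # Python 2
    from urlparse import parse_qs, urlsplit


def build_search_url(keyword, base_url="http://journals.aps.org/search/results"):
    """Return the search URL with the clauses list encoded as compact JSON."""
    clauses = [{"operator": "AND", "field": "all", "value": keyword}]
    # separators=(",", ":") drops the spaces json.dumps would otherwise add,
    # which keeps the result close to the URL shown in the question.
    clauses_json = json.dumps(clauses, separators=(",", ":"))
    query = urlencode([("clauses", clauses_json), ("sort", "relevance")])
    return base_url + "?" + query


search_url = build_search_url("heart")
print(search_url)

# Round-trip check: decode the query string and compare with the input.
decoded = parse_qs(urlsplit(search_url).query)
print(json.loads(decoded["clauses"][0]))
```

The resulting `search_url` can be passed to `urllib2.Request` as in the example above. With requests, the equivalent is to pass the JSON string rather than the nested list, for example `params={"clauses": clauses_json, "sort": "relevance"}`.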