URL call with string keyword replacement

Time: 2014-07-22 23:47:06

Tags: python parsing url urllib2

import urllib2
from xml.dom import minidom

query = """http://phx01.companyA.com:8000/?query=kindle+fire&country_id=1&lang_id=1&linkin_id=8073631&sbh_id=120555,1&weight_group_id=80&request_id=p20.b6894b85df2f81d54003&brand_id
=14623&request_type=SRS&............."""    # spans multiple lines; much of the content is omitted

response = urllib2.urlopen("".join(query.split('\n')))
dom = minidom.parse(response)

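The example query above works. (I replaced the real company name with companyA in the URL above.)

Now, if I want to replace kindle fire with iphone5, how do I do that? I thought I could do something like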

keyword = "iphone5"
"""...... query = %s.............""" %(keyword)

But it failed. I suspect it may have something to do with encoding the keyword, but how do I do that here?

1 answer:

Answer 0: (score: 3)

Add a {query} placeholder to the URL and fill it in with format():

import urllib

query = """http://phx01.companyA.com:8000/?query={query}&country_id=1&lang_id=1&linkin_id=8073631&sbh_id=120555,1&weight_group_id=80&request_id=p20.b6894b85df2f81d54003&brand_id=14623&request_type=SRS"""

value = "kindle fire"
query = query.format(query=urllib.quote_plus(value))
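The same idea can be wrapped in a small helper so the keyword is always encoded before it is substituted. This is only a sketch: the name build_search_url is hypothetical, the template is shortened to a few of the parameters from the question, and the import falls back so the snippet runs on either Python 2 or Python 3 (where quote_plus lives in urllib.parse).

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2, as used in this answer

# Shortened template; the remaining parameters from the question are left out.
URL_TEMPLATE = "http://phx01.companyA.com:8000/?query={query}&country_id=1&lang_id=1"


def build_search_url(keyword):
    """Return the URL with the encoded keyword placed in the query slot."""
    return URL_TEMPLATE.format(query=quote_plus(keyword))


print(build_search_url("kindle fire"))
print(build_search_url("iphone5"))
```

The result of build_search_url() can then be passed to urllib2.urlopen() as in the question.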

Note that you need to encode the value with urllib.quote_plus(). Here is what it does:

>>> import urllib
>>> value = "kindle fire"
>>> urllib.quote_plus(value)
'kindle+fire'
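Encoding matters most when the keyword contains characters that have a meaning of their own inside a URL; an unencoded & would be read as a parameter separator, for instance. A small sketch with made-up keywords (none of them come from the question), comparing quote_plus() with plain quote():

```python
try:
    from urllib.parse import quote, quote_plus  # Python 3
except ImportError:
    from urllib import quote, quote_plus  # Python 2

# Hypothetical keywords chosen to show characters that need escaping.
keywords = ["kindle fire", "AT&T 50% off", "c++"]

for keyword in keywords:
    # quote_plus() turns spaces into "+" and escapes "&", "%", and "+";
    # quote() escapes spaces as "%20" instead.
    print("%-14s quote_plus=%-18s quote=%s"
          % (keyword, quote_plus(keyword), quote(keyword)))
```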

Alternatively, you can build a dictionary of query parameters and pass it to urlencode():

>>> import urllib
>>> value = "kindle fire"
>>> params = {'query': value, 'country_id': '1'}
>>> urllib.urlencode(params)
'query=kindle+fire&country_id=1'
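To turn the encoded string into a full URL, append it to the base address after a "?". A minimal sketch, assuming the base URL from the question and only three of its parameters; a list of pairs is used instead of a dictionary so the parameter order is predictable on older Python versions, where dictionary order is arbitrary.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

base_url = "http://phx01.companyA.com:8000/"

# urlencode() also accepts a sequence of (name, value) pairs and keeps their order.
params = [("query", "iphone5"), ("country_id", "1"), ("lang_id", "1")]

url = base_url + "?" + urlencode(params)
print(url)
```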

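You can even parse the URL with urlparse, extract the query parameters with parse_qsl(), set the query parameter to the new value, and urlencode() the parameters again: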

>>> import urllib, urlparse
>>> url = """http://phx01.companyA.com:8000/?query=kindle+fire&country_id=1&lang_id=1&linkin_id=8073631&sbh_id=120555,1&weight_group_id=80&request_id=p20.b6894b85df2f81d54003"""
>>> params = urlparse.urlparse(url).query
>>> params = urlparse.parse_qsl(params)
>>> params
[('query', 'kindle fire'), ('country_id', '1'), ('lang_id', '1'), ('linkin_id', '8073631'), ('sbh_id', '120555,1'), ('weight_group_id', '80'), ('request_id', 'p20.b6894b85df2f81d54003')]
>>> params = dict(params)
>>> urllib.urlencode(params)
'linkin_id=8073631&country_id=1&lang_id=1&weight_group_id=80&request_id=p20.b6894b85df2f81d54003&query=kindle+fire&sbh_id=120555%2C1'
>>> params['query'] = 'iphone'
>>> urllib.urlencode(params)
'linkin_id=8073631&country_id=1&lang_id=1&weight_group_id=80&request_id=p20.b6894b85df2f81d54003&query=iphone&sbh_id=120555%2C1'
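The session above stops at the encoded query string. One way to get back to a complete URL is to put the new query string into the parsed result and call urlunparse(). This is a sketch rather than the only way to do it: the helper name replace_query_param is hypothetical, the example URL is shortened, and the pairs are kept as a list so the original parameter order survives.

```python
try:
    from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode  # Python 3
except ImportError:
    from urlparse import urlparse, urlunparse, parse_qsl  # Python 2
    from urllib import urlencode


def replace_query_param(url, name, value):
    """Return url with every occurrence of parameter `name` set to `value`."""
    parts = urlparse(url)
    # keep_blank_values=True preserves parameters that have an empty value.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    updated = [(k, value if k == name else v) for k, v in pairs]
    # ParseResult is a namedtuple, so _replace() swaps in the new query string.
    return urlunparse(parts._replace(query=urlencode(updated)))


url = "http://phx01.companyA.com:8000/?query=kindle+fire&country_id=1&sbh_id=120555,1"
print(replace_query_param(url, "query", "iphone5"))
```

As in the session above, the comma in sbh_id comes back as %2C after re-encoding, which decodes to the same value.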