How to remove a query string parameter from a URL using Python

Time: 2011-10-12 02:19:38

Tags: python http cgi urlparse

Example:

http://example.com/?a=text&q2=text2&q3=text3&q2=text4

After removing "q2", it should return:

http://example.com/?a=text&q3=text3

In this case there are multiple "q2" parameters, and all of them have been removed.

8 answers:

Answer 0 (score: 54)

import sys

if sys.version_info.major == 3:
    from urllib.parse import urlencode, urlparse, urlunparse, parse_qs
else:
    from urllib import urlencode
    from urlparse import urlparse, urlunparse, parse_qs

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
u = urlparse(url)
query = parse_qs(u.query)
query.pop('q2', None)
u = u._replace(query=urlencode(query, True))
print(urlunparse(u))
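
One thing to keep in mind with the approach above: parse_qs drops parameters with blank values by default and groups repeated keys together. Below is a minimal sketch of a variant built on parse_qsl, which keeps the remaining parameters in their original order. It assumes Python 3, and the helper name remove_query_params is hypothetical, not part of any library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def remove_query_params(url, names):
    """Return url with every query parameter whose key is in names removed."""
    parts = urlsplit(url)
    # parse_qsl yields (key, value) pairs in order; keep_blank_values=True
    # preserves parameters such as "flag=" that would otherwise be dropped.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in names]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(remove_query_params(
    'http://example.com/?a=text&q2=text2&q3=text3&q2=text4', {'q2'}))
# http://example.com/?a=text&q3=text3
```

Passing a set or list of names lets the same helper remove several parameters in one call.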

Answer 1 (score: 17)

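from urllib.parse import urljoin, urlparse  # Python 2: from urlparse import urljoin, urlparse

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
# Joining the URL with its own path drops the entire query string
print(urljoin(url, urlparse(url).path))  # http://example.com/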
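
Note that this answer drops the whole query string rather than a single parameter. If that is the goal, a minimal sketch with urlsplit does the same while keeping any fragment. It assumes Python 3, and the helper name strip_query is hypothetical.

```python
from urllib.parse import urlsplit


def strip_query(url):
    """Return url without its query string, keeping the fragment if present."""
    # SplitResult is a namedtuple, so _replace returns a modified copy
    return urlsplit(url)._replace(query='').geturl()


print(strip_query('http://example.com/?a=text&q2=text2&q3=text3&q2=text4'))
# http://example.com/
print(strip_query('http://example.com/page?a=1#section'))
# http://example.com/page#section
```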

Answer 2 (score: 10)

Use furl, a URL manipulation library for Python:

import furl
f = furl.furl("http://example.com/?a=text&q2=text2&q3=text3&q2=text4")
f.remove(['q2'])
print(f.url)

Answer 3 (score: 2)

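# cgi.parse_qs is deprecated; on Python 2 use urlparse.parse_qs instead
from urllib.parse import parse_qs, urlparse

url = "http://example.com/?a=text&q2=text2&q3=text3&q2=text4"
qs = parse_qs(urlparse(url)[4])  # index 4 of the parse result is the query string
del qs['q2']
print(qs)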

...to be explicit:

>>> from urllib.parse import parse_qs, urlparse
>>> url = "http://example.com/?a=text&q2=text2&q3=text3&q2=text4"
>>> qs = parse_qs(urlparse(url)[4])
>>> del qs['q2']
>>> print(qs)
{'a': ['text'], 'q3': ['text3']}
>>>

Answer 4 (score: 2)

Isn't this just a matter of splitting the string on a character?

>>> url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
>>> url = url.split('?')[0]
>>> url
'http://example.com/'

Answer 5 (score: 1)

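query_string = "https://example.com/api/api.php?user=chris&auth=true"
# find() returns -1 when there is no '?', which would slice off the last character
pos = query_string.find('?')
url = query_string[:pos] if pos != -1 else query_string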

Answer 6 (score: 0)

Or, more simply, just use url_query_cleaner() from w3lib.url:

from w3lib.url import url_query_cleaner

url = 'http://example.com/?a=text&q2=text2&q3=text3&q2=text4'
url_query_cleaner(url, ('q2',), remove=True)

Output: http://example.com/?a=text&q3=text3

Answer 7 (score: -3)

import re

q = "http://example.com/?a=text&q2=text2&q3=text3&q2=text4"
todelete = "q2"
# Delete every "q2=value" pair; the lookbehind stops names like "xq2" from matching
r = re.sub(r'(?<=[?&])' + re.escape(todelete) + r'=[^&#]*&?', '', q)
# Delete a possible trailing & (or a trailing ? if no parameters are left)
r = re.sub(r'[?&]$', '', r)

print(r)