Question

I have the following URL:

http://google.com/sadfasdfsd$AA=mytag&SS=sdfsdf

What is the best way in Python to get mytag out of the $AA=mytag& part of this string?

Answer 1

Use this regex: =(.+)&

import re
regex = r"=(.+)&"  # greedy: captures everything between the first '=' and the last '&'
print(re.findall(regex, "http://google.com/sadfasdfsd$AA=mytag&SS=sdfsdf")[0])
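One caveat with the pattern above: .+ is greedy, so it only returns mytag because this URL contains a single &. The sketch below uses a hypothetical URL with a third parameter (ZZ=extra is not part of the question) to show the difference, and swaps in a negated character class as one possible fix:

```python
import re

# Hypothetical URL: the question's URL plus one extra parameter (assumption).
url = "http://google.com/sadfasdfsd$AA=mytag&SS=sdfsdf&ZZ=extra"

# Greedy: .+ runs to the last '&' in the string.
greedy = re.findall(r"=(.+)&", url)[0]

# Negated class: [^&]+ stops at the first '&' after the '='.
narrow = re.findall(r"=([^&]+)&", url)[0]

print(greedy)  # mytag&SS=sdfsdf
print(narrow)  # mytag
```

Note that neither pattern checks which parameter it is reading; both simply take the first = in the string.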

Answer 2

To retrieve the mytag that follows $AA=, you can use this simple regex:

(?<=\$AA=)[^&]+

In Python:

import re
match = re.search(r"(?<=\$AA=)[^&]+", subject)  # subject is the URL string; the value is match.group()

Regex explanation

(?<=                     # look behind to see if there is:
  \$                     #   '$'
  AA=                    #   'AA='
)                        # end of look-behind
[^&]+                    # any character except: '&' (1 or more times
                         # (matching the most amount possible))
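For completeness, here is a minimal runnable sketch of the look-behind approach. The helper name get_aa is hypothetical (it is not from the answer), and returning None when nothing matches is just one reasonable choice:

```python
import re

def get_aa(subject):
    """Return the value following '$AA=' in subject, or None if it is absent."""
    # Look-behind requires '$AA=' immediately before the match without
    # including it in the result; [^&]+ then reads up to the next '&'.
    match = re.search(r"(?<=\$AA=)[^&]+", subject)
    return match.group(0) if match else None

print(get_aa("http://google.com/sadfasdfsd$AA=mytag&SS=sdfsdf"))  # mytag
print(get_aa("http://google.com/no-tag-here"))                    # None
```

Checking for None before calling group() avoids an AttributeError when the URL has no $AA= part.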

Answer 3

Try this:

>>> import re
>>> url = 'http://google.com/sadfasdfsd$AA=mytag&SS=sdfsdf'
>>> m = re.search(r'.*\$AA=([^&]*)\&.*', url)
>>> m.group(1)
'mytag'

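In a regex, $ has a special meaning (it anchors the end of the string), so you must escape it to make the regex engine treat it as a literal $. The & has no special meaning in a regex, so escaping it as \& is harmless but not required.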
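A short sketch of what the escaping changes in practice. The variable names are illustrative only, and re.escape is shown as one optional way to build the literal prefix instead of escaping it by hand:

```python
import re

url = "http://google.com/sadfasdfsd$AA=mytag&SS=sdfsdf"

# Unescaped '$' is an end-of-string anchor, so this can never match 'AA=' after it.
unescaped = re.search(r"$AA=([^&]*)&", url)

# Escaped '\$' matches a literal dollar sign; '&' needs no escaping.
escaped = re.search(r"\$AA=([^&]*)&", url)

# re.escape() escapes any special characters in a literal string for us.
built = re.search(re.escape("$AA=") + r"([^&]*)&", url)

print(unescaped)         # None
print(escaped.group(1))  # mytag
print(built.group(1))    # mytag
```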

Answer 4

I'm just throwing this out there to show that there are other ways to do this:

import urlparse

url = "http://google.com/sadfasdfsd?AA=mytag&SS=sdfsdf"
query = urlparse.urlparse(url).query # Extract the query string from the full URL
parsed_query = urlparse.parse_qs(query) # Parses the query string into a dict

print(parsed_query["AA"][0])
# mytag

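For documentation on the urlparse module, see: https://docs.python.org/2/library/urlparse.html. In Python 3 the same functions live in urllib.parse.

NB: parse_qs returns a list of values for each key, so we use [0] to get the first result.

Also, I have assumed the question contains a typo and changed the URL so that it has a conventional query string (? instead of $).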
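A Python 3 sketch of the same idea, assuming the URL really does use ? as in the modified version. The helper get_param is a hypothetical name, not part of the standard library:

```python
from urllib.parse import urlparse, parse_qs

def get_param(url, name):
    """Return the first value of query parameter `name`, or None if missing."""
    query = urlparse(url).query          # text after '?', empty if there is none
    values = parse_qs(query).get(name)   # list of values, or None
    return values[0] if values else None

print(get_param("http://google.com/sadfasdfsd?AA=mytag&SS=sdfsdf", "AA"))  # mytag
print(get_param("http://google.com/sadfasdfsd$AA=mytag&SS=sdfsdf", "AA"))  # None
```

The second call shows why the typo assumption matters: with $ in place of ?, the URL has no query string, so this approach finds nothing and a regex is needed instead.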

Get a tag from a URL in Python
