I am trying to search inside the response of a request (I am using Python with the Requests library). I get the response and check its type, which is unicode.
I want to retrieve a specific link that is located between two other strings. I have tried different approaches found online, such as:
result = re.search('Currently: <a ', s)
url_file = response.find('Currently: <a ', beg=0, end=len(response))
I also tried to convert the unicode string to a normal string:
s = unicodedata.normalize(response, title).encode('ascii','ignore')
I get an error.
EDITED
For example:
This works:
s = 'asdf=5;iwantthis123jasd'
result = re.search('asdf=5;(.*)123jasd', s)
print result.group(1)
This doesn't work (returns error):
s = 'Currently: <a '
result = re.search(r.text, s)
print result.group(1)
Answer 0 (score: 2)
You are using re.search incorrectly. The first argument of the function is the pattern and the second one is the source string:
import re
import requests

# Pattern first, source string second
s = '<a class=gb1 href=[^>]+>'
r = requests.get('https://www.google.com/?q=python')
result = re.search(s, r.text)
if result:  # re.search returns None when nothing matches
    print(result.group(0))
If you simply need the list of all matches you can use: re.findall(s, r.text)
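To make the difference between the two calls concrete, here is a minimal offline sketch. The HTML string and its URLs are made up for illustration and stand in for r.text, so nothing here reflects the markup Google actually returns.

```python
import re

# Hypothetical HTML standing in for r.text, so the example runs without a network call.
html = (
    '<p>Currently: <a href="http://example.com/file1.zip">file1</a></p>'
    '<p>Previously: <a href="http://example.com/file0.zip">file0</a></p>'
)

# Pattern first, source string second. The parentheses capture only the URL.
pattern = r'<a href="([^"]+)"'

# re.search stops at the first match and returns a match object (or None).
first = re.search(pattern, html)
first_url = first.group(1) if first else None

# re.findall returns every match; with one capture group it returns the captured strings.
all_urls = re.findall(pattern, html)

print(first_url)
print(all_urls)
```

group(0) would give the whole matched tag prefix, while group(1) gives only the part inside the parentheses, which is usually what you want when pulling out a link.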
Answer 1 (score: 1)
import re
import requests
res = requests.get("http://google.com")
re.search('pattern', res.text)
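Note that re.search returns None when nothing matches, so calling .group() on the result directly can raise an AttributeError. A small sketch of a guard, assuming the link follows the literal text 'Currently: <a ' as in the question; the helper name link_after_marker and the sample markup are hypothetical:

```python
import re

def link_after_marker(text, marker="Currently: <a "):
    """Return the href value of the first link that follows `marker`, or None."""
    # re.escape makes the marker match literally; [^>]*? skips any attributes before href.
    pattern = re.escape(marker) + r'[^>]*?href="([^"]+)"'
    match = re.search(pattern, text)
    return match.group(1) if match else None

# Made-up response body; in practice this would be res.text.
sample = 'Status. Currently: <a class="dl" href="http://example.com/latest.zip">latest</a>'
print(link_after_marker(sample))
print(link_after_marker("no link here"))
```

For anything beyond a quick extraction, an HTML parser is generally more robust than a regular expression, but the guard against None applies either way.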