How do I scrape the URL from a redirect link?

Time: 2020-05-01 15:34:46

Tags: web-scraping beautifulsoup python-requests

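I am trying to get links from a page. I have already extracted the data-url attribute from each button. When a button is clicked, the site loads URL = something.com/api?call=XXXXXX&auth=XXX

and then redirects to the real site, anotherweb.com.

So I thought that if I requested that URL myself, I might end up at anotherweb.com, and it worked!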

Code:

import requests
# import time
from bs4 import BeautifulSoup

# Page to scrape, e.g. https://nova.egybest.bid/movie/extraction-2020
url = input('Page URL: ')

# Domain of the page (referred to as "id" in this post), e.g. nova.egybest.bid
domain = url.split('/')[2]

html = requests.get(url).text
api_urls = []

# Collect the data-url attribute of every download button in the table
soup = BeautifulSoup(html, 'lxml')
table = soup.find('table', class_='dls_table btns full mgb')
links = table.find_all('a', class_='nop btn g dl _open_window')
for link in links:
    api_urls.append(link['data-url'])

# Request each API URL (query parameters: call, auth) and print where it ends up
base = 'http://' + domain
for api_url in api_urls:
    # time.sleep(4)
    response = requests.get(base + api_url)
    # time.sleep(3)
    print(response.url)

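After a while it stopped working: instead of the target URL, the program prints the id URL (the homepage is loaded).

Is there any way to get the actual URL, anotherweb.com?

Note: id is the page's domain, something.com.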
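
One way to see where the chain falls back to the homepage would be to follow the redirects one hop at a time instead of letting the HTTP client follow them silently. The sketch below is a minimal illustration, not a confirmed fix. `follow_redirects` and its `get` parameter are hypothetical names, and `get` is assumed to behave like `requests.Session.get` (it accepts `allow_redirects=False` and returns an object with `status_code` and `headers`).

```python
from urllib.parse import urljoin

# Status codes that signal an HTTP redirect
REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(get, url, max_hops=10):
    """Follow HTTP redirects one hop at a time and return every URL visited.

    `get` is a hypothetical parameter: any callable shaped like
    requests.Session.get that accepts allow_redirects=False.
    """
    chain = [url]
    for _ in range(max_hops):
        response = get(url, allow_redirects=False)
        location = response.headers.get('Location')
        if response.status_code not in REDIRECT_CODES or not location:
            break  # not a redirect, so this is the final URL
        # Location may be relative, so resolve it against the current URL
        url = urljoin(url, location)
        chain.append(url)
    return chain
```

With requests, this could be called as `follow_redirects(requests.Session().get, full_api_url)`, where `full_api_url` stands for the complete API URL. Reusing the session that fetched the page would also send back any cookies the site set; whether this site needs them is an assumption. Redirects performed by JavaScript or a meta refresh tag would not appear in the chain, since they are not HTTP redirects.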
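
Since the API URL is built from the page's domain plus a data-url value, a small sketch of doing that with the standard library may help rule out a malformed URL as the cause. It assumes the data-url values are paths relative to the site, such as /api?call=XXXXXX&auth=XXX; `page_domain` and `absolute_api_url` are hypothetical helper names. Unlike prefixing 'http://' by hand, this keeps the scheme of the page URL; whether the scheme matters for this site is not known.

```python
from urllib.parse import urljoin, urlsplit


def page_domain(page_url):
    """Return the host part of a URL, e.g. 'nova.egybest.bid'."""
    return urlsplit(page_url).netloc


def absolute_api_url(page_url, data_url):
    """Resolve a button's data-url against the page it was scraped from."""
    return urljoin(page_url, data_url)
```

For example, `absolute_api_url('https://nova.egybest.bid/movie/extraction-2020', '/api?call=XXXXXX&auth=XXX')` resolves against the page's scheme and host instead of relying on string concatenation.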

0 answers:

No answers