I am trying to scrape data from a website. When the first URL is opened in a browser, it redirects to another URL, but when I request it from a Python script, it is not redirected the way it should be. The first URL is - http://www.marriott.com/reservation/availabilitySearch.mi?propertyCode=ABZAP&isSearch=true&isRateCalendar=false&flexibleDateSearchRateDisplay=false&flexibleDateLowestRateMonth=&flexibleDateLowestRateDate=&fromDate=08/10/17&toDate=08/11/17&clusterCode=&corporateCode=&groupCode=&numberOfRooms=1&numberOfGuests=2&incentiveType_Number=&incentiveType=false&marriottRewardsNumber=&useRewardsPoints=false&numberOfChildren=0&childrenAges=&numberOfAdults=2
It redirects to - http://www.marriott.com/reservation/rateListMenu.mi
I am also using a session to maintain cookies, and I send the required headers. Please see the code below -
import requests

# A Session keeps cookies between requests.
session = requests.Session()
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0',
    'Host': 'www.marriott.com',
    'Connection': 'keep-alive',
    'Referer': 'http://www.marriott.com/search/findHotels.mi',
    'Accept': '*/*',
})
second_page = 'http://www.marriott.com/reservation/availabilitySearch.mi?propertyCode=ABZAP&isSearch=true&isRateCalendar=false&flexibleDateSearchRateDisplay=false&flexibleDateLowestRateMonth=&flexibleDateLowestRateDate=&fromDate=08/10/17&toDate=08/11/17&clusterCode=&corporateCode=&groupCode=&numberOfRooms=1&numberOfGuests=2&incentiveType_Number=&incentiveType=false&marriottRewardsNumber=&useRewardsPoints=false&numberOfChildren=0&childrenAges=&numberOfAdults=2'
# Session headers are sent automatically, so they are not passed again here.
response = session.get(second_page)
print(response.url)
When I print the URL, it prints the original URL, so the request was never redirected. When I print the content of the first page, I get -
<!doctype html>
<html>
<head><script type="text/javascript" src="/common/js/marriottCommon.js"> </script>
<meta charset="utf-8">
</head>
<body>
<script>
var xhttp = new XMLHttpRequest();
xhttp.addEventListener("load", function(a,b,c){
window.location.reload()
});
xhttp.open('GET', '/reservation/availabilitySearch.mi?istl_enable=true&istl_data', true);
xhttp.send();
</script>
</body>
Please tell me what I am missing here.
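
For what it's worth, the script in the returned page appears to send a GET to /reservation/availabilitySearch.mi?istl_enable=true&istl_data and then reload the page. requests never executes JavaScript, so neither step happens in my script. Below is a minimal sketch of doing those steps by hand. It assumes (I have not verified this) that the server sets whatever cookie it needs during that XHR request, so that repeating the original request afterwards gets redirected. `fetch_after_js_reload` and `XHR_PATH` are hypothetical names I made up for illustration, and `session` is anything with a `get(url)` method that keeps cookies, such as `requests.Session`.

```python
from urllib.parse import urljoin

# Path requested by the inline script in the placeholder page
# (copied from the HTML shown in the question).
XHR_PATH = "/reservation/availabilitySearch.mi?istl_enable=true&istl_data"


def fetch_after_js_reload(session, url, xhr_path=XHR_PATH):
    """Mimic the placeholder page: load it, send its XHR, then reload.

    Hypothetical helper, not part of requests. `session` must expose
    get(url) and keep cookies between calls (e.g. requests.Session).
    """
    # 1. Initial load: the server answers with the placeholder page.
    session.get(url)
    # 2. The request the inline script sends through XMLHttpRequest.
    #    Assumption: this is where the server sets the cookie it checks.
    session.get(urljoin(url, xhr_path))
    # 3. window.location.reload(): request the original URL again.
    return session.get(url)
```

With the session from my code this would be called as `response = fetch_after_js_reload(session, second_page)`, after which `response.url` can be printed again to see whether the redirect to rateListMenu.mi happened. If the site depends on more JavaScript than this one request, a sketch like this will not be enough.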