Beautiful Soup does not return the web page as shown in the developer/inspector tools

Asked: 2019-07-16 13:44:26

Tags: python-3.x beautifulsoup

I am using Beautiful Soup to parse the response of a GET request. However, the result is completely different from what I see in Chrome's inspector tool.

I am trying to scrape the data returned from https://www.nofrills.ca/search/?search-bar=Basil with Beautiful Soup. I am using the requests library and the bs4 library.

import requests
from bs4 import BeautifulSoup
page = requests.get("https://www.nofrills.ca/search/?search-bar=Basil")
soup = BeautifulSoup(page.content, 'html.parser')

But what I get instead is:

<!DOCTYPE html>

<html>
<script type="text/javascript">var _ldPerfStart = Date.now();</script>
<head>
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1.0, maximum-scale=1.0" name="viewport"/>
<meta content="app-id=1194066746, affiliate-data=ct=smart-app-banner&amp;pt=1384326" name="apple-itunes-app"/>
<meta content="app-id=pc.express.grocery.pickup&amp;hl=en_CA" name="google-play-app"/>
<script async="true" type="text/javascript">
            function targetPageParams() { 
                return { 
                    "site": "nofrills",
                }; 
            }; 
        </script>
<script async="true" src="https://d3rzy2hoo29vi.cloudfront.net/assets/js/at_v1.js" type="text/javascript"></script>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/futura/futuraStd-heavy.woff2" rel="preload" type="font/woff2"/>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/univers/2B816F_0_0.woff2" rel="preload" type="font/woff2"/>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/univers/2B816F_1_0.woff2" rel="preload" type="font/woff2"/>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/univers/2B816F_2_0.woff2" rel="preload" type="font/woff2"/>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/univers/2B816F_3_0.woff2" rel="preload" type="font/woff2"/>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/univers/2B816F_4_0.woff2" rel="preload" type="font/woff2"/>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/icons/v1/grocery-icons.woff2" rel="preload" type="font/woff2"/>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/noir/Noir_Std.otf" rel="preload" type="font/opentype"/>
<link as="font" crossorigin="crossorigin" href="https://d3rzy2hoo29vi.cloudfront.net/fonts/noir/Noir_Std_Semi_Bold.otf" rel="preload" type="font/opentype"/>
<link href="https://assets.shop.loblaws.ca" rel="preconnect"/>
<link href="https://assets.adobedtm.com" rel="preconnect"/>
<link href="https://d3rzy2hoo29vi.cloudfront.net/builds/production/1.1.36/6f1d2a5e/nofrills-bundle.css" rel="stylesheet" type="text/css"/>
<link href="https://assets.shop.loblaws.ca/ContentMedia/nfr/logos/64x64icon.ico" media="all" rel="shortcut icon" type="image/x-icon">
<script async="true" src="https://assets.adobedtm.com/ec12e179889c41354087f1ac19e02839d7c19f0e/satelliteLib-105d18abadb6163816bb52f11c640058778e502a.js" type="text/javascript"></script>
<script>
            // Google Tag Manager
            (function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start': new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-NPWHZ7F');
        </script>
</link></head>
<body>
<div data-customer-pickuplocation-id="" data-default-pickup-location="0730" data-enabled-bronx-page-ids="search-results,search-results-no-results,productDetails,subcategory,cartReviewPage" data-forgot-password-flyout="" data-user-creation-flyout="" id="bronx-data"></div>
<div id="root"></div>
<script src="https://d3rzy2hoo29vi.cloudfront.net/builds/production/1.1.36/6f1d2a5e/nofrills-bundle.js" type="text/javascript"></script>
</body>
</html>

1 Answer:

Answer 0 (score: 0)

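The HTML that requests receives is only a shell: it contains an empty <div id="root"></div>, and the product list is filled in afterwards by JavaScript, which requests does not execute. The page makes an API request, which you can find in the browser's Network tab, and that request returns JSON. You can call that API directly, but a header needs to be specified. Parse the response to get whatever information you need.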

import requests

# The request needs the Site-Banner header; .json() decodes the JSON body
r = requests.get('https://www.nofrills.ca/api/product/search/basil?pageSize=24', headers={'Site-Banner': 'nofrills'}).json()
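The same request can be wrapped in a small helper that sets a timeout and surfaces HTTP errors before decoding the JSON. This is only a sketch under stated assumptions: the names search_products, API_ROOT and the get parameter, as well as the 10-second timeout, are illustrative and not part of the original answer. The endpoint and the Site-Banner header are taken from the call above rather than from any documented API, so they may change without notice.

```python
from urllib.parse import quote

# Endpoint as used in the answer above; not a documented public API.
API_ROOT = "https://www.nofrills.ca/api/product/search"


def search_products(term, page_size=24, get=None):
    """Return the decoded JSON for a product search (hypothetical helper)."""
    if get is None:
        # Imported lazily so the helper can also be called with a stub.
        import requests
        get = requests.get
    response = get(
        f"{API_ROOT}/{quote(term, safe='')}",  # percent-encode the search term
        params={"pageSize": page_size},
        headers={"Site-Banner": "nofrills"},  # header the answer reports as required
        timeout=10,  # assumed value; avoids hanging on a stalled connection
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json()
```

Calling search_products("basil") should issue the same request as the one-liner above. Passing a different callable as get makes it possible to try the helper without touching the network.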

For example, to pull out the names:

names = [item['name'] for item in r['results']]
print(names)
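The list comprehension above raises a KeyError if the response has no results key or if an entry lacks a name. A more tolerant variant might look like the sketch below. The function name extract_names is hypothetical, and the sample payload is made up for illustration: it only mirrors the results/name shape used above, not the full real response.

```python
def extract_names(payload):
    """Collect product names, skipping entries that have no 'name' field."""
    results = payload.get("results") or []
    return [item["name"] for item in results if item.get("name")]


# Made-up payload with the same shape the answer relies on.
sample = {
    "results": [
        {"name": "Basil, Fresh"},
        {"code": "123"},  # entry without a name is skipped
        {"name": "Basil Pesto"},
    ]
}
print(extract_names(sample))
```

With the real response, extract_names(r) would stand in for the list comprehension.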