Python: I can't seem to get information out of the HTML

Time: 2017-08-03 17:49:27

Tags: python beautifulsoup request urllib

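I'm having a lot of trouble working out what I need to specify to extract certain parts of a website's HTML. Everything works if I hardcode the HTML in the script, but when I try to pull the information from the website itself I get the error shown further down.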

HARDCODED SCRIPT WORKING -

html = "" # hardcoded the HTML from the site
soup = BeautifulSoup(html, 'lxml')
div = soup.find_all("div",{"class":"fp-root"})  
productid = [x['data-product-id'] for x in div]
print(productid)

Code -

session = requests.Session()
product_url = 'http://randomsite.com/something.html'
response = session.get(product_url)

soup1 = BeautifulSoup(response.url, 'lxml')
div1 = soup1.find_all("div",{"class":"fp-root"})  
all_sizes1 = [x['data-product-id'] for x in div1]
print(all_sizes1)

Error -

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bs4/__init__.py:282: UserWarning: "http://randomsite.com/something.html" looks like a URL. Beautiful Soup is not an HTTP client. You should probably use an HTTP client like requests to get the document behind the URL, and feed that document to Beautiful Soup.
  ' that document to Beautiful Soup.' % decoded_markup
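
The warning text suggests that Beautiful Soup was handed the address rather than the page: in requests, `response.url` is the URL string, while `response.text` holds the downloaded markup. A minimal sketch of the same extraction, assuming the page really contains the `fp-root` div; the helper name `extract_product_ids` is made up for illustration, and `html.parser` is used only to avoid depending on lxml:

```python
from bs4 import BeautifulSoup


def extract_product_ids(html):
    """Return the data-product-id of every <div> whose classes include fp-root."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        div["data-product-id"]
        for div in soup.find_all("div", class_="fp-root")
        if div.has_attr("data-product-id")  # skip divs that lack the attribute
    ]


# With requests, pass the body of the response, not its address:
#   response = session.get(product_url)
#   ids = extract_product_ids(response.text)

sample = '<div class="fp-root fp-align" data-product-id="505316967"></div>'
print(extract_product_ids(sample))  # ['505316967']
```

If the list comes back empty on the live page, the div may be produced by JavaScript after load, in which case it is not in the HTML that requests receives.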

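So I tried using urllib.

That gave me a 503 error and I don't know why. I'm lost: I can't tell whether I can get any further from here, or whether I'm even on the right track. Any help and explanation would be much appreciated.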

Code -

from bs4 import BeautifulSoup
import requests
import urllib.request
import os
os.system('clear')


url = urllib.request.urlopen('http://randomsite.com/something.html')
pull = url.read()
print(pull)

soup = BeautifulSoup(pull, 'lxml')
div = soup.find_all("div",{"class":"fp-root"})  
all_sizes = [x['data-product-id'] for x in div]
print(all_sizes)

Error -

urllib.error.HTTPError: HTTP Error 503: Service Unavailable
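
One possible cause of a 503 here, which is only a guess because nothing in the post confirms it, is that the server turns away urllib's default `Python-urllib` User-Agent. The sketch below sends a browser-like header instead and reports HTTP errors rather than crashing. The header value and the helper names `build_request` and `fetch_html` are illustrative assumptions:

```python
import urllib.error
import urllib.request

# Illustrative value; any browser-like User-Agent string could be tried here.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def build_request(url):
    """Build a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def fetch_html(url, timeout=10):
    """Return the decoded page body, or None if the server answers with an HTTP error."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace")
    except urllib.error.HTTPError as err:
        print(f"Server answered {err.code} {err.reason}")
        return None
```

If the 503 persists with the header in place, the site may be rate limiting or temporarily unavailable, and changing the client will not help.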

HTML -

<div class="atg_store_sizePicker size-label">
    <label class="atg_store_pickerLabel">
    Size
    </label>

      <div class="fp-root fp-align" data-product-id="505316967"></div>

      <!-- Check if the fit predictor is enabled for current site or not -->

    <script type="text/javascript" src="/javascript/fitpredictor.js"></script>


    <span id="fp-data" data-securitystatus="0" data-productid="505316967" data-customerid="678380871"
        data-market="US" data-language="en" data-shippingcountry="US"
        data-page="pdp" data-brand="adidas" data-productname="Men's Suede Gazelle Sneakers " data-category="BNY-mens-sneakers" data-minprice="150.0"
        data-maxprice="150.0" data-color="CREAM"></span>
        <input type="hidden" id="fp_allSizes" value='["6 M","7 M","7.5 M","8 M","8.5 M","9 M","9.5 M","10 M","10.5 M","11 M","11.5 M","12 M","13 M"]' />
        <input type="hidden" id="fp_availableSizes" value='["6 M","7 M","7.5 M","8 M","8.5 M","9 M","9.5 M","10 M","10.5 M","11 M","11.5 M","13 M"]' />

    <span class="selector">

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169686" data-onhand-quantity="1" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             6 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169693" data-onhand-quantity="1" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             7 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169785" data-onhand-quantity="1" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             7.5 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169709" data-onhand-quantity="1" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             8 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169792" data-onhand-quantity="2" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="2" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             8.5 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169716" data-onhand-quantity="5" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="5" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             9 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169808" data-onhand-quantity="3" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="3" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             9.5 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169723" data-onhand-quantity="4" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="4" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             10 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169815" data-onhand-quantity="3" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="3" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             10.5 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169730" data-onhand-quantity="3" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="3" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             11 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169822" data-onhand-quantity="1" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             11.5 M
            </a>

            <a class=" atg_store_oneSize disabled-size sizePicker " data-productid="505316967"  data-skuid="00505053169747" data-onhand-quantity="0" 
                      data-onorder-quantity="0" data-availabilitystatus="1001" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-currentcommerceitemid="" data-isprivate="0" 
                      data-atp="0" data-gwp="0" data-vendorcolor="CREAM" 
                      data-product-current-site="BNY" href="javascript:void(0)">
             12 M
            </a>

            <a class=" atg_store_oneSize sizePicker " data-productid="505316967"  data-skuid="00505053169754" data-onhand-quantity="2" 
                      data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0" 
                      data-list-price="150.0" data-on-sale="false" data-atp="2" data-isprivate="0" 
                      data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid="" 
                      data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true"  data-is-ap-enabled-for-product="true" href="javascript:void(0)">
             13 M
            </a>

    </span>    
  </div>
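
For reference, the fragment above stores the size lists as JSON arrays in the hidden `fp_allSizes` and `fp_availableSizes` inputs, so they can be read without walking every `<a>` tag. A minimal sketch, assuming the fetched page matches this fragment; `parse_size_info` is a hypothetical helper and the sample string is a shortened copy of the markup:

```python
import json

from bs4 import BeautifulSoup


def parse_size_info(html):
    """Collect the product id and the size lists stored in the hidden inputs."""
    soup = BeautifulSoup(html, "html.parser")

    def sizes(input_id):
        # The value attribute holds a JSON array such as ["6 M","7 M"].
        tag = soup.find("input", id=input_id)
        return json.loads(tag["value"]) if tag else []

    root = soup.find("div", class_="fp-root")
    return {
        "product_id": root["data-product-id"] if root else None,
        "all_sizes": sizes("fp_allSizes"),
        "available_sizes": sizes("fp_availableSizes"),
    }


sample = """
<div class="fp-root fp-align" data-product-id="505316967"></div>
<input type="hidden" id="fp_allSizes" value='["6 M","7 M","12 M"]' />
<input type="hidden" id="fp_availableSizes" value='["6 M","7 M"]' />
"""
info = parse_size_info(sample)
print(info["product_id"])  # 505316967
print(sorted(set(info["all_sizes"]) - set(info["available_sizes"])))  # ['12 M']
```

The difference between the two lists gives the sold-out sizes, which in the full fragment matches the one link marked `disabled-size` (12 M).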

0 answers:

No answers