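I'm having a lot of trouble working out how to extract specific parts of this HTML from a website. Everything works if I hardcode the HTML in the script, but when I try to pull the information from the live site I get the error below.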
HARDCODED SCRIPT WORKING -
html = ""  # hardcoded the HTML from the site
soup = BeautifulSoup(html, 'lxml')
div = soup.find_all("div",{"class":"fp-root"})
productid = [x['data-product-id'] for x in div]
print(productid)
Code -
session = requests.Session()
product_url = 'http://randomsite.com/something.html'
response = session.get(product_url)
soup1 = BeautifulSoup(response.url, 'lxml')
div1 = soup1.find_all("div",{"class":"fp-root"})
all_sizes1 = [x['data-product-id'] for x in div1]
print(all_sizes1)
Error -
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bs4/__init__.py:282: UserWarning: "http://randomsite.com/something.html" looks like a URL. Beautiful Soup is not an HTTP client. You should probably use an HTTP client like requests to get the document behind the URL, and feed that document to Beautiful Soup.
' that document to Beautiful Soup.' % decoded_markup
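The warning suggests that a URL string, not HTML, reached BeautifulSoup: `response.url` is the address of the page, while `response.text` is the body. Below is a minimal sketch of the parsing step, assuming the live page contains the same `fp-root` div as the HTML shown in this post. The helper name `extract_product_ids` is hypothetical, the sample string stands in for `response.text`, and `html.parser` is used only so the sketch does not depend on lxml.

```python
from bs4 import BeautifulSoup


def extract_product_ids(markup):
    """Return the data-product-id of every div whose class list has fp-root."""
    soup = BeautifulSoup(markup, "html.parser")
    divs = soup.find_all("div", {"class": "fp-root"})
    # Skip divs that lack the attribute instead of raising KeyError.
    return [div["data-product-id"] for div in divs if div.has_attr("data-product-id")]


# Stand-in for response.text (the page body), not response.url (the address).
sample_markup = '<div class="fp-root fp-align" data-product-id="505316967"></div>'
product_ids = extract_product_ids(sample_markup)
print(product_ids)
```

With a live request, the call would presumably be `extract_product_ids(response.text)`.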
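So I tried using urllib instead.
I got a 503 error and I don't know why. I'm lost, and I can't tell whether I'm on the right track or how to get any further from here. Any help and explanation would be greatly appreciated.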
Code -
from bs4 import BeautifulSoup
import requests
import urllib.request
import os
os.system('clear')
url = urllib.request.urlopen('http://randomsite.com/something.html')
pull = url.read()
print(pull)
soup = BeautifulSoup(pull, 'lxml')
div = soup.find_all("div",{"class":"fp-root"})
all_sizes = [x['data-product-id'] for x in div]
print(all_sizes)
Error -
urllib.error.HTTPError: HTTP Error 503: Service Unavailable
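One possible cause of the 503, which is an assumption and not something the error itself confirms, is that the server turns away the default urllib User-Agent. The sketch below only builds a request with a browser-like header and does not send it. The helper name `build_request` and the User-Agent value are hypothetical.

```python
import urllib.request

# Hypothetical browser-like User-Agent; the exact value is an assumption.
BROWSER_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Build (but do not send) a GET request carrying a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("http://randomsite.com/something.html")
# urllib stores header names capitalized, so the key is "User-agent".
print(request.get_header("User-agent"))
```

The request object can then be passed to `urllib.request.urlopen(request)`. The equivalent with requests would be `session.get(product_url, headers={"User-Agent": ...})`. If the 503 persists, the site may be blocking automated clients in some other way.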
HTML -
<div class="atg_store_sizePicker size-label">
<label class="atg_store_pickerLabel">
Size
</label>
<div class="fp-root fp-align" data-product-id="505316967"></div>
<!-- Check if the fit predictor is enabled for current site or not -->
<script type="text/javascript" src="/javascript/fitpredictor.js"></script>
<span id="fp-data" data-securitystatus="0" data-productid="505316967" data-customerid="678380871"
data-market="US" data-language="en" data-shippingcountry="US"
data-page="pdp" data-brand="adidas" data-productname="Men's Suede Gazelle Sneakers " data-category="BNY-mens-sneakers" data-minprice="150.0"
data-maxprice="150.0" data-color="CREAM"></span>
<input type="hidden" id="fp_allSizes" value='["6 M","7 M","7.5 M","8 M","8.5 M","9 M","9.5 M","10 M","10.5 M","11 M","11.5 M","12 M","13 M"]' />
<input type="hidden" id="fp_availableSizes" value='["6 M","7 M","7.5 M","8 M","8.5 M","9 M","9.5 M","10 M","10.5 M","11 M","11.5 M","13 M"]' />
<span class="selector">
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169686" data-onhand-quantity="1"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
6 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169693" data-onhand-quantity="1"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
7 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169785" data-onhand-quantity="1"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
7.5 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169709" data-onhand-quantity="1"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
8 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169792" data-onhand-quantity="2"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="2" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
8.5 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169716" data-onhand-quantity="5"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="5" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
9 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169808" data-onhand-quantity="3"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="3" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
9.5 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169723" data-onhand-quantity="4"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="4" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
10 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169815" data-onhand-quantity="3"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="3" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
10.5 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169730" data-onhand-quantity="3"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="3" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
11 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169822" data-onhand-quantity="1"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="1" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
11.5 M
</a>
<a class=" atg_store_oneSize disabled-size sizePicker " data-productid="505316967" data-skuid="00505053169747" data-onhand-quantity="0"
data-onorder-quantity="0" data-availabilitystatus="1001" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-currentcommerceitemid="" data-isprivate="0"
data-atp="0" data-gwp="0" data-vendorcolor="CREAM"
data-product-current-site="BNY" href="javascript:void(0)">
12 M
</a>
<a class=" atg_store_oneSize sizePicker " data-productid="505316967" data-skuid="00505053169754" data-onhand-quantity="2"
data-onorder-quantity="0" data-availabilitystatus="1000" data-sale-price="150.0"
data-list-price="150.0" data-on-sale="false" data-atp="2" data-isprivate="0"
data-expected-delivery-month="" data-gwp="0" data-vendorcolor="CREAM" data-currentcommerceitemid=""
data-product-current-site="BNY" data-is-ap-enabled-for-sku ="true" data-is-ap-enabled-for-product="true" href="javascript:void(0)">
13 M
</a>
</span>
</div>
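Since the variable names in the code suggest the sizes are the real target, it may be simpler to read the hidden `fp_allSizes` and `fp_availableSizes` inputs, whose `value` attributes look like JSON arrays. A minimal sketch, assuming that format holds. The helper name `read_size_list` is hypothetical, and the sample markup is a trimmed copy of the inputs above.

```python
import json

from bs4 import BeautifulSoup

# Trimmed copy of the hidden inputs from the HTML above.
sample_markup = """
<input type="hidden" id="fp_allSizes" value='["6 M","7 M","12 M","13 M"]' />
<input type="hidden" id="fp_availableSizes" value='["6 M","7 M","13 M"]' />
"""


def read_size_list(soup, input_id):
    """Decode the JSON array stored in the value attribute of a hidden input."""
    tag = soup.find("input", id=input_id)
    return json.loads(tag["value"]) if tag is not None else []


soup = BeautifulSoup(sample_markup, "html.parser")
all_sizes = read_size_list(soup, "fp_allSizes")
available_sizes = read_size_list(soup, "fp_availableSizes")
# Sizes listed for the product but missing from the available list.
sold_out = [size for size in all_sizes if size not in available_sizes]
print(all_sizes, available_sizes, sold_out)
```

In the full HTML above, the only size missing from `fp_availableSizes` is "12 M", which matches the `disabled-size` anchor.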