Why do only some UPC codes work in this code

Time: 2018-04-01 18:09:37

Tags: python web-scraping barcode

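In this code, I query a website to retrieve data about a specific UPC code. The problem is that the code only works for some UPCs. For example, with the UPC "627386004004" it raises an error, but with "20313447006" no error occurs. Why does this work for only some UPCs? If I simply paste the two URLs into a browser, both return a valid page.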

https://www.realcanadiansuperstore.ca/search/1522603193415/page/~item/20313447006/~sort/recommended/~selected/true

https://www.realcanadiansuperstore.ca/search/1522603193415/page/~item/627386004004/~sort/recommended/~selected/true
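
Both URLs follow the same pattern, which can be sketched as a small helper. The name `build_search_url` and its parameters are hypothetical, introduced only for illustration. One observable detail: the leading number in the URLs above has 13 digits, which looks like a timestamp in milliseconds, whereas `round(time.time())` yields a 10-digit value in seconds. Whether the site treats the two differently is not known, so the millisecond default below is an assumption.

```python
import time

BASE_URL = 'https://www.realcanadiansuperstore.ca'


def build_search_url(upc, timestamp=None):
    """Build the search URL for a UPC (hypothetical helper, not a site API)."""
    if timestamp is None:
        # Assumption: the path segment is a millisecond timestamp, as the
        # 13-digit value in the URLs above suggests.
        timestamp = int(time.time() * 1000)
    return (f'{BASE_URL}/search/{timestamp}/page/~item/{upc}'
            '/~sort/recommended/~selected/true')


# Rebuild the two URLs from the question with the same leading number.
for upc in ('20313447006', '627386004004'):
    print(build_search_url(upc, timestamp=1522603193415))
```

Passing the timestamp explicitly makes it easy to compare a seconds-based value against a milliseconds-based one for the same UPC.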

Here is my code:

import requests
import time
from lxml import html

upc = '627386004004'
base_url = 'https://www.realcanadiansuperstore.ca'

# Get search results
now = round(time.time())
url = base_url + f'/search/{now}/page/~item/{upc}/~sort/recommended/~selected/true'
response = requests.get(url)
body = response.text

# Parse search results
tree = html.fromstring(body)
if len(tree.xpath('//div[@class="content-tile-list"]/div')) != 1:
    raise Exception('Invalid number of results')
url = base_url + tree.xpath('//div[@class="content-tile-list"]/div/div/div/div[@class="product-info"]/div/a/@href')[0]

# Get product page
res = requests.get(url)
tree = html.fromstring(res.text)
name = " ".join(tree.xpath('//h1[@class="product-name"]/text()')[1].split())

print(name)

The error I get is the exception that I raise myself in the code:

Exception: Invalid number of results
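
The check that raises this exception fails both when no result tile matches and when several match, so the message alone does not say which case occurred. A minimal sketch of a more informative check, assuming a hypothetical helper `pick_single_result` that receives the list returned by the `tree.xpath(...)` call on the tile list:

```python
def pick_single_result(tiles, upc):
    """Return the only element of tiles, or raise with the actual count.

    Hypothetical helper: `tiles` stands in for the list returned by
    tree.xpath('//div[@class="content-tile-list"]/div').
    """
    if len(tiles) != 1:
        raise ValueError(
            f'Expected exactly 1 result for UPC {upc}, got {len(tiles)}')
    return tiles[0]


# Stand-in lists instead of real lxml elements, so this runs offline.
print(pick_single_result(['tile'], '20313447006'))
try:
    pick_single_result([], '627386004004')
except ValueError as err:
    print(err)
```

Printing `response.status_code` and the length of `response.text` next to this count may also help show whether the page fetched by `requests` differs from the one the browser displays.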

Please let me know if I have left out any details or done something wrong :)

Edit: I am using Python 3.6

0 answers:

No answers