Question

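In this code, I search a website to retrieve data about a specific UPC code. The problem is that the code only works for some UPCs. For example, the UPC "627386004004" raises an error, but "20313447006" does not. Why does this work only for some UPCs? If I simply paste the two URLs into a browser, both return a valid page.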

https://www.realcanadiansuperstore.ca/search/1522603193415/page/~item/20313447006/~sort/recommended/~selected/true

https://www.realcanadiansuperstore.ca/search/1522603193415/page/~item/627386004004/~sort/recommended/~selected/true

Here is my code:

import requests
import time
from lxml import html

upc = '627386004004'
base_url = 'https://www.realcanadiansuperstore.ca'

# Get search results
now = round(time.time())
url = base_url + f'/search/{now}/page/~item/{upc}/~sort/recommended/~selected/true'
response = requests.get(url)
body = response.text

# Parse search results
tree = html.fromstring(body)
if len(tree.xpath('//div[@class="content-tile-list"]/div')) != 1:
    raise Exception('Invalid number of results')
url = base_url + tree.xpath('//div[@class="content-tile-list"]/div/div/div/div[@class="product-info"]/div/a/@href')[0]

# Get product page
res = requests.get(url)
tree = html.fromstring(res.text)
name = " ".join(tree.xpath('//h1[@class="product-name"]/text()')[1].split())

print(name)

The error returned is the exception that I raise myself in the code:

Exception: Invalid number of results
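
To narrow this down, it may help to report how many result tiles the returned HTML actually contains, since the check above fails in the same way for zero tiles and for several. The following is a minimal sketch, not the code from the question: it swaps lxml for the standard library's html.parser so that it runs without third-party packages, and the names TileCounter, count_result_tiles and describe_tile_count are hypothetical. It assumes the tiles are the div elements directly inside a div whose class is exactly "content-tile-list", and it approximates "direct child" by div nesting depth.

```python
from html.parser import HTMLParser


class TileCounter(HTMLParser):
    """Count <div> elements nested one div level below div.content-tile-list.

    Hypothetical helper: approximates the XPath
    //div[@class="content-tile-list"]/div by tracking div depth only.
    """

    def __init__(self):
        super().__init__()
        self.div_depth = 0      # number of <div> elements currently open
        self.list_depth = None  # div depth of the open tile list, if any
        self.tiles = 0          # tiles counted so far

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        self.div_depth += 1
        css_class = dict(attrs).get("class")
        if self.list_depth is None and css_class == "content-tile-list":
            self.list_depth = self.div_depth
        elif self.list_depth is not None and self.div_depth == self.list_depth + 1:
            self.tiles += 1

    def handle_endtag(self, tag):
        if tag != "div":
            return
        if self.list_depth == self.div_depth:
            self.list_depth = None  # the tile list itself has been closed
        self.div_depth -= 1


def count_result_tiles(body):
    """Return the number of result tiles found in an HTML string."""
    parser = TileCounter()
    parser.feed(body)
    parser.close()
    return parser.tiles


def describe_tile_count(count):
    """Turn a tile count into a more specific diagnostic message."""
    if count == 0:
        return "no result tiles in the returned HTML"
    if count == 1:
        return "exactly one result tile"
    return f"{count} result tiles, so the search is ambiguous"
```

Calling describe_tile_count(count_result_tiles(response.text)) before raising would show whether the failing UPC produces no tiles at all in the HTML that requests receives, or more than one.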

Please let me know if I have left out any details or done something wrong :)

Edit: I am using Python 3.6.

Why do only some UPC codes work in this code?

0 answers: