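My goal is to get all of the items from a page, but I only get the first 10 of 25. I think it has something to do with the table, which I believe is some kind of widget? I'm a beginner and still learning the basics.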
import time
import mechanize
from bs4 import BeautifulSoup
br = mechanize.Browser()
br.set_handle_robots(False)
br.addheaders = [("User-agent", "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10 (maverick) Firefox/3.6.13")]
sign_in = br.open('https://sellercentral.amazon.com/gp/homepage.html?')
br.select_form(name="signinWidget")
br["username"] = 'spam'
br["password"] = 'eggs'
logged_in = br.submit()
orders_html = br.open("https://sellercentral.amazon.com/hz/inventory/ref=ag_invmgr_dnav_xx_?tbla_myitable=sort:{%22sortOrder%22%3A%22DESCENDING%22%2C%22sortedColumnId%22%3A%22date%22};search:;pagination:1;")
print('Login complete...')
time.sleep(5)
soup = BeautifulSoup(orders_html,'html.parser')
partNums = soup.find_all('span', {'class': 'mt-text-content mt-table-main'})
print(partNums)
for part in partNums:
    print(part.text)
print('Process Complete.')
Answer 0 (score: 0)
You can get the page source with br.response().read():
orders_page = br.open("https://sellercentral.amazon.com/hz/inventory/ref=ag_invmgr_dnav_xx_?tbla_myitable=sort:{%22sortOrder%22%3A%22DESCENDING%22%2C%22sortedColumnId%22%3A%22date%22};search:;pagination:1;") # loads page
orders_html = br.response().read() # saves page source
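Once the page source is saved, it can be handed to BeautifulSoup the same way as in the question. The following is only a minimal sketch: the HTML string is a made-up stand-in for what br.response().read() might return, and extract_part_numbers is a hypothetical helper name, not part of mechanize or bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page source returned by br.response().read().
# The real Seller Central markup may look different.
orders_html = """
<table>
  <tr><td><span class="mt-text-content mt-table-main">PART-001</span></td></tr>
  <tr><td><span class="mt-text-content mt-table-main">PART-002</span></td></tr>
  <tr><td><span class="other">ignored</span></td></tr>
</table>
"""

def extract_part_numbers(html):
    """Return the text of every span whose class attribute matches the question's selector."""
    soup = BeautifulSoup(html, 'html.parser')
    spans = soup.find_all('span', {'class': 'mt-text-content mt-table-main'})
    return [span.get_text(strip=True) for span in spans]

part_nums = extract_part_numbers(orders_html)
for part in part_nums:
    print(part)
```

If this still returns fewer items than the page shows in a browser, the missing rows may simply not be in the downloaded HTML, since mechanize does not execute JavaScript.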