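Can anyone help me log in to StockX.com using Scrapy? I'm trying to build a web scraper to pull price information from StockX, but I'm not familiar with the type of authentication the StockX login requires. I've seen tutorials that use a CSRF token, but I don't see any token submitted in the form data when I log in, so I'm confused about how to authenticate a login on StockX with Scrapy. Any help would be greatly appreciated.
Current code: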
import scrapy

from ..items import SxscrapyItem  # assumes the default Scrapy project layout


class StockXSpider(scrapy.Spider):
    name = 'sx'
    # page_number = 2
    start_urls = [
        'https://www.stockx.com/login?iss=https%3A%2F%2Faccounts.stockx.com%2F'
    ]

    def parse(self, response):
        # Submit the login form found on the page (placeholder credentials shown)
        return scrapy.FormRequest.from_response(
            response,
            formdata={"username": "myemail@email.com", "password": "mypassword"},
            callback=self.scrape,
        )

    def scrape(self, response):
        all_div_blanks = response.css('div.product-header-media')
        # Both stat values share one selector: index 0 is the ask, index 1 the bid
        stats = response.css('div.en-us.stat-value.stat-small::text').extract()
        for blanks in all_div_blanks:
            # Build a fresh item on each pass instead of reusing one instance
            items = SxscrapyItem()
            # Note: these lookups are page-wide; the loop variable is not used
            items['product_name'] = response.css('div.col-md-12').xpath('//h1/text()').extract()
            items['lowest_ask'] = stats[0]
            items['highest_bid'] = stats[1]
            yield items
Answer 0 (score: 1)
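You can use Selenium instead to log in to StockX: locate the email and password inputs, then send your login details to them with send_keys.
I'm not sure whether StockX has an API, but if it does, that would probably be easier to use than logging in and scraping manually.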
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

driver = webdriver.Firefox()  # you could use Chrome instead
driver.get("http://accounts.stockx.com/login")
time.sleep(4)  # small delay so the page can load before entering the login
# find_element(By.ID, ...) replaces find_element_by_id, which was removed in Selenium 4.3
driver.find_element(By.ID, "email-login").send_keys("myemail@email.com")  # inputs your email
driver.find_element(By.ID, "password-login").send_keys("mypassword")  # inputs your password
driver.find_element(By.ID, "btn-login").click()  # clicks the login button
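If you still want to do the actual scraping in Scrapy after logging in with Selenium, one possible approach is to hand the browser's session cookies over to your spider. The snippet below is only a sketch of that idea: the helper name selenium_cookies_to_dict and the sample cookie records are made up for illustration, and I have not verified that StockX will accept a session reused this way.

```python
from typing import Dict, Iterable, Mapping


def selenium_cookies_to_dict(cookies: Iterable[Mapping[str, object]]) -> Dict[str, str]:
    """Flatten Selenium-style cookie records into a name -> value mapping.

    driver.get_cookies() returns a list of dicts that carry "name" and
    "value" keys plus extras such as "domain" and "path". Only the name
    and value are kept here; records missing either key are skipped.
    """
    return {
        str(cookie["name"]): str(cookie["value"])
        for cookie in cookies
        if "name" in cookie and "value" in cookie
    }


# Made-up records shaped like driver.get_cookies() output (not real StockX cookies).
sample_cookies = [
    {"name": "session_id", "value": "abc123", "domain": ".example.com", "path": "/"},
    {"name": "locale", "value": "en-US", "domain": ".example.com", "path": "/"},
]

cookie_dict = selenium_cookies_to_dict(sample_cookies)
print(cookie_dict)
```

In a real run you would call selenium_cookies_to_dict(driver.get_cookies()) after the login click, then pass the result to Scrapy, for example scrapy.Request(url, cookies=cookie_dict, callback=self.scrape). Cookie domain and expiry are dropped in this simplified form, so treat it as a starting point.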