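I am creating a Telegram bot to which you can send any product link (from Myntra, Amazon, or Flipkart). Whenever the price drops, it will send the user a message. Here is my code for scraping the price from Flipkart and Myntra: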
import requests
from bs4 import BeautifulSoup

URL = 'https://www.myntra.com/sports-sandals/roadster/roadster-men-charcoal-grey-sports-sandals/9024251/buy'
# Send a browser-like User-Agent header with the request.
head = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'}
page = requests.get(URL, headers=head)
soup = BeautifulSoup(page.content, "html.parser")

# Pick the selectors to use based on which site the link points to.
if "flipkart" in URL:
    title = soup.find(class_="_35KyD6").get_text()
    price = soup.find(class_="_1vC4OE _3qQ9m1").get_text()
    print(title)
    print(price)
if "myntra" in URL:
    price = soup.find(class_="pdp-price")
    name = soup.find(class_="pdp-name")
    #title = soup.find("div class=\"pdp-price-info\"")
    print(price)
This code can extract the price and name from Flipkart, but for Myntra both `price` and `name` come back as None. I want to get the name highlighted in the image.
Answer 0 (score: 2)
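The page data is populated dynamically by JavaScript from JSON, so the `pdp-price` and `pdp-name` elements are not present in the HTML that `requests` downloads. The JSON is not loaded via XHR, however. You can find it embedded in the HTML itself, extract it with a regex, and convert it to a dictionary: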
import re
import json
import requests

url = 'https://www.myntra.com/sports-sandals/roadster/roadster-men-charcoal-grey-sports-sandals/9024251/buy'
# Send a browser-like User-Agent header with the request.
headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'}
response = requests.get(url, headers=headers)

# The product data is embedded in a <script> tag as a JSON object assigned to window.__myx.
match = re.findall(r"<script>window\.__myx = (.+?)</script>", response.text)
json_data = json.loads(match[0])

product_name = json_data['pdpData']['name']
mrp = json_data['pdpData']['price']['mrp']
selling_price = json_data['pdpData']['price']['discounted']

print('ProductName:', product_name)
print('MRP:', mrp)
print('SellingPrice:', selling_price)
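
The same extraction can be tried offline, without hitting the site. The sketch below is only an illustration under stated assumptions: `SAMPLE_HTML` is a made-up page that mimics the `window.__myx` assignment (the real markup may differ or change), and `extract_embedded_json` is a hypothetical helper name. Instead of a non-greedy regex it uses `json.JSONDecoder.raw_decode`, which parses a single JSON value starting at a given index, and it returns `None` rather than raising when the data is missing.

```python
import json
import re

# Hypothetical sample page; the real Myntra markup may differ.
SAMPLE_HTML = """
<html><body>
<script>window.__myx = {"pdpData": {"name": "Sample Sandals", "price": {"mrp": 1999, "discounted": 999}}}</script>
</body></html>
"""


def extract_embedded_json(html, variable="window.__myx"):
    """Return the JSON value assigned to `variable` in `html`, or None if absent."""
    # Locate "<variable> = " and remember where the JSON value starts.
    marker = re.search(re.escape(variable) + r"\s*=\s*", html)
    if marker is None:
        return None
    try:
        # raw_decode parses one JSON value at the given index and
        # ignores whatever text follows it (here, the closing </script>).
        data, _end = json.JSONDecoder().raw_decode(html, marker.end())
    except json.JSONDecodeError:
        return None
    return data


data = extract_embedded_json(SAMPLE_HTML)
if data is not None:
    print('ProductName:', data['pdpData']['name'])
    print('SellingPrice:', data['pdpData']['price']['discounted'])
```

Returning `None` for a missing or malformed block lets the caller decide what to do when the page layout changes, instead of failing with an `IndexError` on `match[0]`.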