Question

If I run the code below, I get nothing. Can you find the problem in my code?

import requests
import sys
from bs4 import BeautifulSoup
r = requests.get('https://www.flipkart.com/search?q=laptop')
content = r.content.decode(encoding='UTF-8')
soup = BeautifulSoup(r.content.decode(encoding='UTF-8'), "lxml")
reviews = soup.find_all('div', {"class": "_3wU53n"})
print(reviews)

Expected result:

HP Core i3 6th Gen - (4 GB/1 TB HDD/DOS) 1AC75PA#ACJ 15-BE012TU Notebook

Answer 1

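Your code looks fine to me. I ran your exact code with one small change (looping over the results and printing each item's text) and got the desired result: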

import requests
import sys
from bs4 import BeautifulSoup
r = requests.get('https://www.flipkart.com/search?q=laptop')
content = r.content.decode(encoding='UTF-8')
soup = BeautifulSoup(r.content.decode(encoding='UTF-8'), "lxml")
reviews = soup.find_all('div', {"class": "_3wU53n"})
for item in reviews:
    print(item.text)

Output:

HP 15q Core i3 7th Gen - (8 GB/1 TB HDD/DOS) 15q-bu038TU Laptop

Dell Vostro 15 3000 Core i5 8th Gen - (8 GB/1 TB HDD/Windows 10 Home/2 GB Graphics) 3578 Laptop...

... and so on

Try running the code from a different machine or network. If you send too many requests, Flipkart may block your IP.
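To tell a block apart from a markup change, it can help to look at the HTTP status before parsing. The following is only a rough sketch using the standard library: the helper names `fetch` and `diagnose`, the `Mozilla/5.0` User-Agent value, and the idea that a block shows up as status 403 or 429 are my assumptions, not something Flipkart documents.

```python
import urllib.error
import urllib.request


def fetch(url, user_agent="Mozilla/5.0", timeout=10):
    """Return (status, html). html is empty when the server answers with an HTTP error."""
    # Assumption: sending a browser-like User-Agent; the value is only an example.
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; keep only the status code.
        return err.code, ""


def diagnose(status, html, css_class):
    """Classify why a scrape may have come back empty."""
    # Assumption: 403 (Forbidden) or 429 (Too Many Requests) indicates a block.
    if status in (403, 429):
        return "blocked"
    if status != 200:
        return "http-error"
    # The page loaded, but the class being searched for is not in the markup.
    if css_class not in html:
        return "class-missing"
    return "ok"
```

Calling `diagnose(*fetch('https://www.flipkart.com/search?q=laptop'), "_3wU53n")` would then hint at whether the empty list comes from a refused request or from the class name no longer being on the page. Network failures such as DNS errors raise `urllib.error.URLError`, which this sketch does not catch.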

Answer 2

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

myurl = "https://www.flipkart.com/search?q=iphone&marketplace=FLIPKART&otracker=start&as-show=on&as=off"

# Download the search results page, then close the connection
uclient = uReq(myurl)
page_html = uclient.read()
uclient.close()

psoup = soup(page_html, "html.parser")

# Each container is a result row: a div whose class is "bhgxx2 col-12-12"
container = psoup.findAll("div", {"class": "bhgxx2 col-12-12"})

# y collects all product titles
y = []
for Product in container:
    # The product title is stored in a div with the class "_3wU53n"
    ProductTitle = Product.findAll("div", {"class": "_3wU53n"})
    for i in ProductTitle:
        y.append(i.text)
        print(i.text)
# All product titles shown on the Flipkart page for this URL are printed
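The selection step can also be checked without hitting Flipkart at all. Below is a minimal sketch that runs the same kind of lookup on a small made-up HTML snippet: the function name `extract_titles`, the `SAMPLE_HTML` markup, and the product names in it are invented for illustration, and only the class names `bhgxx2 col-12-12` and `_3wU53n` come from the code above.

```python
from bs4 import BeautifulSoup


def extract_titles(html, title_class="_3wU53n"):
    """Return the stripped text of every div that carries title_class."""
    soup = BeautifulSoup(html, "html.parser")
    return [div.get_text(strip=True)
            for div in soup.find_all("div", class_=title_class)]


# Hypothetical markup that imitates the structure the answer relies on.
SAMPLE_HTML = """
<div class="bhgxx2 col-12-12">
  <div class="_3wU53n">Laptop A</div>
  <div class="price">100</div>
</div>
<div class="bhgxx2 col-12-12">
  <div class="_3wU53n"> Laptop B </div>
</div>
"""

titles = extract_titles(SAMPLE_HTML)
print(titles)
```

If this works on the sample but the live page gives an empty list, the likely cause is on the page side (a different class name or a refused request) rather than in the parsing code.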

How to get Flipkart product data using web scraping
