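I am trying to scrape product information from Grofers and BigBasket, but I am having trouble with the findAll() function. When I call len(imgList), the length is always 0, and the result is always an empty list. How can I fix this? Can anyone help? I also get status code 403 from Grofers.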
```python
from bs4 import BeautifulSoup
url = 'https://grofers.com/cn/grocery-staples/cid/16'
driver = webdriver.Chrome(r'C:\Users\HP\data\chromedriver.exe')
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html,'html.parser')
data = soup.findAll('plp-product__name')
print(data)

from bs4 import BeautifulSoup
response = requests.get('https://grofers.com/cn/grocery-staples/cid/16')
response
content = response.content
data = BeautifulSoup(content,'html5lib')
read = data.findAll('plp-product__name ')
read
```
The output I get is:
[]
Answer 0 (score: 0)
You have not imported `webdriver`.
Try:

```python
from selenium import webdriver
driver = webdriver.Chrome(executable_path=r'C:\Users\HP\data\chromedriver.exe')
```
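Or select the elements by CSS class with `data = soup.select('div.plp-product__name')`.

Note that the correct method is `find_all`, not `findAll`, because `findAll` is deprecated in the bs4 library. Passing the class name as the first argument searches for a tag with that name, so filter on the class instead: `data = soup.find_all("div", class_="plp-product__name")`.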
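To make the difference concrete, here is a minimal sketch that runs against a small static HTML string rather than the live site. The markup and product names are made up for illustration; only the class name `plp-product__name` comes from the question, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the rendered product page.
html = """
<div class="plp-product">
  <div class="plp-product__name">Basmati Rice</div>
  <div class="plp-product__name">Toor Dal</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A bare string is treated as a tag name, so this looks for
# <plp-product__name> elements and finds none.
by_tag = soup.find_all("plp-product__name")

# class_ filters on the CSS class instead.
by_class = soup.find_all("div", class_="plp-product__name")

# The equivalent CSS selector.
by_selector = soup.select("div.plp-product__name")

names = [tag.get_text(strip=True) for tag in by_class]
print(len(by_tag), names)
```

If the class-based lookup is still empty on the real page, the HTML you received probably does not contain the products at all, which would be consistent with the 403 response from `requests`.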