实际上,我需要做一个有关机器学习的项目。在那我要训练很多图像。我搜索了此问题,但没有成功。 谁能帮我解决这个问题。预先感谢。
答案 0 :(得分:1)
我用Google图片下载了使用硒的图片。这只是一种基本方法。
from selenium import webdriver
import time
import urllib.request
import os
from selenium.webdriver.common.keys import Keys
browser = webdriver.Chrome("path\\to\\the\\webdriverFile")
browser.get("https://www.google.com")
search = browser.find_element_by_name(‘q’)
search.send_keys(key_words,Keys.ENTER) # use required key_words to download images
elem = browser.find_element_by_link_text(‘Images’)
elem.get_attribute(‘href’)
elem.click()
value = 0
for i in range(20):
browser.execute_script(“scrollBy(“+ str(value) +”,+1000);”)
value += 1000
time.sleep(3)
elem1 = browser.find_element_by_id(‘islmp’)
sub = elem1.find_elements_by_tag_name(“img”)
try:
os.mkdir(‘downloads’)
except FileExistsError:
pass
count = 0
for i in sub:
src = i.get_attribute('src')
try:
if src != None:
src = str(src)
print(src)
count+=1
urllib.request.urlretrieve(src,
os.path.join('downloads','image'+str(count)+'.jpg'))
else:
raise TypeError
except TypeError:
print('fail')
if count == required_images_number: ## use number as required
break
check this详细说明。
下载驱动程序here
答案 1 :(得分:0)
给我的提示是:使用图片API。这是我的最爱:Bing Image Search API
以下是Send search queries using the REST API and Python中的文本。
首先,将subscription_key设置为Bing API服务的有效订阅密钥。
Python
subscription_key = None
assert subscription_key
接下来,确认search_url端点正确。在撰写本文时,Bing搜索API仅使用一个端点。如果遇到授权错误,请针对Azure仪表板中的Bing搜索终结点再次检查此值。 Python
search_url = "https://api.cognitive.microsoft.com/bing/v7.0/images/search"
设置search_term以查找小狗的图像。 Python
search_term = "puppies"
以下块使用Python中的请求库来调出Bing搜索API并将结果作为JSON对象返回。观察到我们通过headers字典传递了API密钥,并通过params字典传递了搜索词。要查看可用于过滤搜索结果的选项的完整列表,请参阅REST API文档。
Python
import requests
headers = {"Ocp-Apim-Subscription-Key" : subscription_key}
params = {"q": search_term, "license": "public", "imageType": "photo"}
response = requests.get(search_url, headers=headers, params=params)
response.raise_for_status()
search_results = response.json()
search_results对象包含实际图像以及丰富的元数据(如相关项目)。例如,以下代码行可以提取前16个结果的缩略图URL。 Python
thumbnail_urls = [img["thumbnailUrl"] for img in search_results["value"][:16]]
然后使用PIL库下载缩略图图像,并使用matplotlib库将其渲染到$ 4 \ times 4 $网格上。
Python
%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
from io import BytesIO
f, axes = plt.subplots(4, 4)
for i in range(4):
for j in range(4):
image_data = requests.get(thumbnail_urls[i+4*j])
image_data.raise_for_status()
image = Image.open(BytesIO(image_data.content))
axes[i][j].imshow(image)
axes[i][j].axis("off")
plt.show()
来自Bing Image Search API的响应以JSON的形式返回。该示例响应已被截断以显示单个结果。
JSON
{
"_type":"Images",
"instrumentation":{
"_type":"ResponseInstrumentation"
},
"readLink":"images\/search?q=tropical ocean",
"webSearchUrl":"https:\/\/www.bing.com\/images\/search?q=tropical ocean&FORM=OIIARP",
"totalEstimatedMatches":842,
"nextOffset":47,
"value":[
{
"webSearchUrl":"https:\/\/www.bing.com\/images\/search?view=detailv2&FORM=OIIRPO&q=tropical+ocean&id=8607ACDACB243BDEA7E1EF78127DA931E680E3A5&simid=608027248313960152",
"name":"My Life in the Ocean | The greatest WordPress.com site in ...",
"thumbnailUrl":"https:\/\/tse3.mm.bing.net\/th?id=OIP.fmwSKKmKpmZtJiBDps1kLAHaEo&pid=Api",
"datePublished":"2017-11-03T08:51:00.0000000Z",
"contentUrl":"https:\/\/mylifeintheocean.files.wordpress.com\/2012\/11\/tropical-ocean-wallpaper-1920x12003.jpg",
"hostPageUrl":"https:\/\/mylifeintheocean.wordpress.com\/",
"contentSize":"897388 B",
"encodingFormat":"jpeg",
"hostPageDisplayUrl":"https:\/\/mylifeintheocean.wordpress.com",
"width":1920,
"height":1200,
"thumbnail":{
"width":474,
"height":296
},
"imageInsightsToken":"ccid_fmwSKKmK*mid_8607ACDACB243BDEA7E1EF78127DA931E680E3A5*simid_608027248313960152*thid_OIP.fmwSKKmKpmZtJiBDps1kLAHaEo",
"insightsMetadata":{
"recipeSourcesCount":0,
"bestRepresentativeQuery":{
"text":"Tropical Beaches Desktop Wallpaper",
"displayText":"Tropical Beaches Desktop Wallpaper",
"webSearchUrl":"https:\/\/www.bing.com\/images\/search?q=Tropical+Beaches+Desktop+Wallpaper&id=8607ACDACB243BDEA7E1EF78127DA931E680E3A5&FORM=IDBQDM"
},
"pagesIncludingCount":115,
"availableSizesCount":44
},
"imageId":"8607ACDACB243BDEA7E1EF78127DA931E680E3A5",
"accentColor":"0050B2"
}
}