Can I programmatically scroll down to expand the Google Image search page?

Time: 2014-08-09 10:06:33

Tags: python web-scraping google-image-search

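I am trying to scrape some images from Google, but the page only loads more results as you scroll down, which limits me to downloading a fixed number of images. Is there a way to imitate that scrolling from Python code? For example, could Mechanize be used in this case?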

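In short, I need to simulate the scroll-down expansion of Google Image search to increase the number of results returned, and then scrape the image URLs.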

1 answer:

Answer 0: (score: 3)

This may get you banned fairly quickly, though I'm not sure. It requires BeautifulSoup and requests.

import requests
from bs4 import BeautifulSoup

s = requests.session()
s.headers.update({"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36"})

URL = "https://www.google.dk/search"
images = []

def get_images(query, start):
    screen_width = 1920
    screen_height = 1080
    params = {
        "q": query,
        "sa": "X",
        "biw": screen_width,
        "bih": screen_height,
        "tbm": "isch",
        "ijn": start // 100,  # page index; integer division keeps it an int on Python 3
        "start": start,
        # "ei": "",  # Looks like a unique ID. Sending it might help avoid a ban, but probably won't.
    }

    request = s.get(URL, params=params)
    bs = BeautifulSoup(request.text, "html.parser")  # name the parser explicitly

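    for div in bs.find_all("div", {"class": "rg_di"}):
        img = div.find("img")
        # Skip entries that have no lazily loaded thumbnail URL.
        if img is not None and img.get("data-src"):
            images.append(img["data-src"])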


# Requests 5 pages of 100 results each (up to 500 images).
for x in range(0, 5):
    get_images("cats", x*100)

for x in images:
    print(x)
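
To make the paging idea explicit: each "scroll" seems to correspond to one more request with a larger start offset. Here is a minimal sketch of just that parameter arithmetic, with no network access. The helper name page_params and the page_size argument are my own (hypothetical), and treating ijn as a page index is only inferred from the code above, not from any documented Google API.

```python
def page_params(query, pages, page_size=100):
    """Yield one query-parameter dict per simulated scroll step.

    Assumption: "ijn" is a page index and "start" is the offset of the
    first result on that page, as the answer's code implies.
    """
    for page in range(pages):
        yield {
            "q": query,
            "tbm": "isch",              # image search, as in the code above
            "ijn": page,                # page index
            "start": page * page_size,  # offset of the first result
        }


steps = list(page_params("cats", 3))
print([(p["ijn"], p["start"]) for p in steps])  # [(0, 0), (1, 100), (2, 200)]
```

Each dict could then be passed as params to s.get(URL, params=...), ideally with a pause between requests, since rapid paging is presumably what gets you banned.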