Improving a reCaptcha 2.0 solving automation script (Selenium)

Time: 2015-08-27 12:18:22

Tags: python selenium captcha recaptcha

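I wrote Python code with Selenium to solve the new behaviour captcha. It still falls short of fully imitating user behaviour: the code finds and clicks the captcha checkbox, but Google then presents an additional image check.

This is not easy to automate. How can I improve the code so that the captcha is solved immediately, without the image check (that is, without giving Google any hint that a robot is present)?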

reCaptcha testing ground

Python code

from time import sleep
from random import uniform
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions as EC

# to imitate hovering 
def hover(element):  
    hov = ActionChains(driver).move_to_element(element)
    hov.perform()

# optional: adding www.hola.org proxy profile to FF (extension is installed on FF, Win 8)
ffprofile = webdriver.FirefoxProfile()
hola_file = '/Users/Igor/AppData/Roaming/Mozilla/Firefox/Profiles/7kcqxxyd.default-1429005850374/extensions/hola/hola_firefox_ext_1.9.354_www.xpi'
ffprofile.add_extension(hola_file) 
# end of the optional part

driver = webdriver.Firefox(ffprofile) 
url='http://tarex.ru/testdir/recaptcha/recaptcha.php'

# open new tab, also optional
# (Keys.COMMAND is the macOS modifier; on Windows or Linux use Keys.CONTROL)
driver.find_element(By.TAG_NAME, 'body').send_keys(Keys.COMMAND + 't')
driver.get(url)

recaptchaFrame = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME ,'iframe'))
        )
frameName = recaptchaFrame.get_attribute('name')

# move the driver to the iFrame (switch_to_frame is deprecated in favour of switch_to.frame)
driver.switch_to.frame(frameName)

# *************  locate CheckBox  **************
CheckBox = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID ,"recaptcha-anchor"))
        )

# *************  hover CheckBox  ***************
rand=uniform(1.0, 1.5)
print('\n\r explicit wait for ', rand , ' seconds...')
sleep(rand) 
hover(CheckBox)

# *************  click CheckBox  ***************
rand=uniform(0.5, 0.7)
print('\n\r explicit wait for ', rand , 'seconds...')
sleep(rand)
# making click on CheckBox...
# note: WebElement.click() returns None, so the result printed below is always None
clickReturn = CheckBox.click()
print('\n\r after click on CheckBox... \n\r CheckBox click result: ', clickReturn)
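
The script above repeats the same "pick a random delay, print it, sleep" pattern before the hover and before the click. A minimal sketch of how that pattern could be factored into one helper follows. The name `random_pause` and the `sleep_fn` parameter are hypothetical (they are not part of the original script or of Selenium); `sleep_fn` exists only so the helper can be exercised without actually waiting.

```python
from random import uniform
from time import sleep


def random_pause(low, high, sleep_fn=sleep):
    """Wait for a random duration between low and high seconds.

    Hypothetical helper: it mirrors the uniform() + print() + sleep()
    sequence used in the script above and returns the chosen delay.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    delay = uniform(low, high)
    print('explicit wait for', delay, 'seconds...')
    sleep_fn(delay)  # time.sleep by default; replaceable for testing
    return delay
```

With such a helper, the two wait blocks would reduce to `random_pause(1.0, 1.5)` before `hover(CheckBox)` and `random_pause(0.5, 0.7)` before `CheckBox.click()`.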

2 answers:

Answer 0 (score: 3)

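You can't do this. In any case, I think the image challenge is triggered when too many requests come from the same IP, so you cannot bypass it; what you can do is use proxies.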

Answer 1 (score: 0)

Here is my solution:

# -*- coding: utf-8 -*-

from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()

driver.get(url='https://www.google.com/recaptcha/api2/demo')

# find the first iframe on the page (the reCAPTCHA widget)
captcha_iframe = WebDriverWait(driver, 10).until(
    ec.presence_of_element_located(
        (
            By.TAG_NAME, 'iframe'
        )
    )
)

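# move to the iframe and click it (ActionChains clicks the element's centre)
ActionChains(driver).move_to_element(captcha_iframe).click().perform()

# "I'm not a robot" step: wait for the g-recaptcha-response element, then click it via JavaScript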
captcha_box = WebDriverWait(driver, 10).until(
    ec.presence_of_element_located(
        (
            By.ID, 'g-recaptcha-response'
        )
    )
)

driver.execute_script("arguments[0].click()", captcha_box)