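I am trying to access a website with Selenium (using geckodriver). The site says I am blocked, yet I can open it manually in the Firefox browser. So I compared the components of my fingerprint, and the only difference is that the "webdriver" property of the Navigator object is set to "true" when I use Selenium. I tried running the code below. The page just loads with "webdriver" still set to "true", and the script call then returns an error message.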
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.firefox.options import Options

firefox_binary = '/usr/bin/firefox'
options = Options()

caps = DesiredCapabilities().FIREFOX
# caps["pageLoadStrategy"] = "normal"  # complete
caps["pageLoadStrategy"] = "eager"  # interactive

injected_javascript = "Object.defineProperty(navigator, 'webdriver', { value: 'false' })"

driver = webdriver.Firefox(executable_path=r'/home/kkkk/ggecko/geckodriver', firefox_binary=firefox_binary)
driver.get('https://auth.citromail.hu/regisztracio/')
driver.execute_async_script(injected_javascript)
What am I doing wrong, or is there a different way to achieve this?
Answer 0 (score: 2)
See this question: Selenium webdriver: firefox headless inject javascript to modify browser property
It offers a useful approach.
Here is the code:
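from selenium import webdriver

options = webdriver.FirefoxOptions()
# set_headless() is deprecated and absent from recent Selenium releases;
# passing the Firefox command-line flag works instead.
options.add_argument("-headless")
driver = webdriver.Firefox(options=options)

# solution found here https://stackoverflow.com/questions/17385779/how-do-i-load-a-javascript-file-into-the-dom-using-selenium
# Append a <script> element that loads javascriptFirefox.js into the current document.
# The relative src is resolved against the URL of the page that is currently loaded.
driver.execute_script("var s=window.document.createElement('script'); s.src='javascriptFirefox.js';window.document.head.appendChild(s);")
# Note: driver.get() loads a new document, so a script added before this call does not carry over to it.
driver.get('https://auth.citromail.hu/regisztracio/')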
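Because the relative src above depends on what the current page can reach, one alternative is to read the JavaScript file in Python and pass its text straight to execute_script. The following is only a minimal sketch: inject_script_file is a hypothetical helper name (not part of Selenium), and it assumes the driver exposes Selenium's execute_script method. Scripts that ran while the page was loading have already seen the original property values, so this may not be enough for sites that check early.

```python
from pathlib import Path


def inject_script_file(driver, script_path):
    """Run the contents of a local JavaScript file in the current page.

    Hypothetical helper, not part of Selenium. `driver` is assumed to be
    any object with Selenium's `execute_script(script)` method.
    """
    # Read the whole file as text so nothing has to be fetched by the browser.
    source = Path(script_path).read_text(encoding="utf-8")
    # execute_script runs the text synchronously in the currently loaded document.
    return driver.execute_script(source)
```

A possible call order is driver.get(url) first, then inject_script_file(driver, "javascriptFirefox.js"), so the overrides land in the document that was actually loaded.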
The JavaScript file javascriptFirefox.js:
const setProperty = () => {
  // Overwrite the `languages` property to use a custom getter.
  Object.defineProperty(navigator, "languages", {
    get: function() {
      return ["en-US", "en", "es"];
    }
  });
  // Overwrite the `plugins` property to use a custom getter.
  Object.defineProperty(navigator, 'plugins', {
    get: () => [1, 2, 3, 4, 5],
  });
  // Pass the Webdriver test
  Object.defineProperty(navigator, 'webdriver', {
    get: () => false,
  });
};
setProperty();
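The same overrides can also be generated from Python rather than kept in a separate file, which makes it easy to check afterwards what the page reports. This is a rough sketch under stated assumptions: build_navigator_overrides and read_webdriver_flag are hypothetical names introduced here, and the driver is assumed to expose Selenium's execute_script method. Whether a given site still detects automation after such overrides is not something this sketch can guarantee.

```python
import json


def build_navigator_overrides(overrides):
    """Return JavaScript that redefines the given `navigator` properties.

    Hypothetical helper. `overrides` maps property names to JSON-serializable
    values; each value is returned from a custom getter.
    """
    statements = []
    for name, value in overrides.items():
        # json.dumps yields valid JavaScript literals for strings, booleans, numbers and lists.
        statements.append(
            "Object.defineProperty(navigator, %s, { get: () => %s, configurable: true });"
            % (json.dumps(name), json.dumps(value))
        )
    return "\n".join(statements)


def read_webdriver_flag(driver):
    """Ask the current page what `navigator.webdriver` evaluates to."""
    return driver.execute_script("return navigator.webdriver;")
```

For example, driver.execute_script(build_navigator_overrides({"webdriver": False, "languages": ["en-US", "en", "es"]})) followed by read_webdriver_flag(driver) shows whether the override took effect in the current document.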