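I'm trying to create a simple script that tells me whether a website is based on WordPress. The idea is to check whether I get a 404 when trying to access its wp-admin, for example at the URL "https://www.audi.co.il/wp-admin" (which returns "true" because it exists). When I try a URL that does not exist (e.g. "https://www.audi.co.il/wp-blablabla"), PHP still returns "true", even though Chrome shows a 404 in its Network tab when I paste the same URL into the address bar. Why is this happening, and how can I fix it? Thanks! Here is the code (based on another user's answer):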
Traceback (most recent call last):
File "Script.py", line 65, in <module>
driver.find_element_by_xpath("//a[contains(@href, '')])[20]").click()
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
self._execute(Command.CLICK_ELEMENT)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
return self._parent.execute(command, params)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotVisibleException: Message: element not interactable
(Session info: chrome=74.0.3729.131)
(Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Windows NT 6.1.7601 SP1 x86_64)
Answer 0 (score: 1)
You can try to fetch the wp-admin page; if it does not exist, there is a good chance the site is not WordPress.
function isWordpress($url)
{
    $ch = curl_init();

    // Set the URL and other options
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the response as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Do not follow redirects; a 301/302 is reported as that status
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);

    // Perform the request; only the status code is used, the body is discarded
    curl_exec($ch);
    $httpStatus = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);

    // Close the cURL handle and free system resources
    curl_close($ch);

    // Only a plain 200 counts as "found"
    return $httpStatus == 200;
}

if (isWordpress("http://www.example.com/wp-admin")) {
    // This is WordPress
} else {
    // Not WordPress
}
Because some WordPress installations protect the wp-admin URL, this check will not be 100% accurate.
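To illustrate why a bare "is it 200?" test can mislead in both directions, here is a minimal sketch that looks at the status code together with the redirect target. The helper names `classifyWpAdminResponse` and `probeWpAdmin`, and the three verdict strings, are hypothetical and not part of the answer above. The sketch assumes a default WordPress setup, where an anonymous request to /wp-admin/ is redirected to wp-login.php; customized or hardened sites may behave differently.

```php
<?php
// Hypothetical helper: turn a status code and an optional redirect target
// into a rough verdict. The rules below are assumptions, not a standard.
function classifyWpAdminResponse(int $status, ?string $location = null): string
{
    // Assumption: a default install redirects anonymous visitors of
    // /wp-admin/ to wp-login.php.
    if (in_array($status, [301, 302, 303, 307, 308], true)
        && $location !== null
        && strpos($location, 'wp-login.php') !== false) {
        return 'likely';
    }
    // Something answered directly at the requested path.
    if ($status === 200) {
        return 'likely';
    }
    // The path may exist but be protected, so no conclusion is drawn.
    if ($status === 401 || $status === 403) {
        return 'unknown';
    }
    // 404, unrelated redirects, and everything else.
    return 'unlikely';
}

// Hypothetical wrapper: performs the request and classifies the result.
function probeWpAdmin(string $url): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Redirects are not followed, so the first response is inspected.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    // With FOLLOWLOCATION disabled, this holds the URL the server pointed to.
    $location = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
    return classifyWpAdminResponse($status, $location ?: null);
}
```

Calling `probeWpAdmin("http://www.example.com/wp-admin/")` with the trailing slash is probably the better choice, since many servers first answer the slash-less path with a redirect that only adds the slash, which this sketch would report as 'unlikely'. Sites that redirect every unknown path, or that hide the login page, can still fool it.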