Script to check whether a URL is a WordPress site

Time: 2019-05-17 22:03:54

Tags: php wordpress

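I am trying to create a simple script that tells me whether a website is based on WordPress. The idea is to check whether I get a 404 when trying to access its wp-admin URL, for example "https://www.audi.co.il/wp-admin" (which returns "true" because it exists). When I try a URL that does not exist (for example "https://www.audi.co.il/wp-blablabla"), PHP still returns "true", even though Chrome shows a 404 in its Network tab when I paste the same URL into the address bar. Why does this happen, and how can I fix it? Thanks! Here is the code (based on another user's answer):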

1 answer:

Answer 0: (score: 1)

You can try to request the wp-admin page; if it does not exist, there is a good chance the site is not WordPress.

function isWordpress($url)
{
    $ch = curl_init();

    // set the URL and other appropriate options
    curl_setopt($ch, CURLOPT_URL, $url);
    // return the response body as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // do not follow redirects; only the first response is inspected
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);

    // perform the request (the body is discarded) and read the HTTP status code
    curl_exec($ch);
    $httpStatus = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);

    // close the cURL resource and free up system resources
    curl_close($ch);

    // only a plain 200 response counts as "wp-admin exists"
    return $httpStatus === 200;
}

if ( isWordpress("http://www.example.com/wp-admin") ) {
    // This is wordpress
} else {
    // Not wordpress
}

Because some WordPress installations protect the wp-admin URL, this check may not be 100% accurate.
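
One possible refinement, offered only as a rough sketch: many WordPress installs answer a logged-out request for /wp-admin/ with a redirect to wp-login.php, so a check that does not follow redirects may never see a 200. The variant below follows redirects and then looks at the final URL. The function names `looksLikeWordpressLogin` and `probeWpAdmin` are hypothetical, and matching on "wp-login.php" is an assumption about default WordPress behaviour, not something the answer above states; sites that rename or hide the login page will still be missed.

```php
<?php
// Hypothetical helper: decides from the final status code and final URL
// whether the probe ended on something that looks like a WordPress login page.
// Assumption: a default install redirects logged-out visitors to wp-login.php.
function looksLikeWordpressLogin(int $status, string $effectiveUrl): bool
{
    return $status === 200 && strpos($effectiveUrl, 'wp-login.php') !== false;
}

// Hypothetical wrapper: requests <site>/wp-admin/, follows redirects,
// and passes the final status and URL to the helper above.
function probeWpAdmin(string $siteUrl): bool
{
    $ch = curl_init(rtrim($siteUrl, '/') . '/wp-admin/');

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not print the body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow the redirect chain
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);          // but not indefinitely
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // give up after 10 seconds

    curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    $effectiveUrl = (string) curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);

    return looksLikeWordpressLogin($status, $effectiveUrl);
}
```

Calling `probeWpAdmin("http://www.example.com")` would then replace the `isWordpress(...)` call above. Keeping the decision in a separate helper makes the heuristic easy to adjust, for instance to also accept a 401 or 403 from a protected wp-admin.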