Extending a PHP regular expression to cover the "srcset" and "style" attributes

Time: 2017-01-16 16:25:35

Tags: php regex wordpress url

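I've created a WordPress plugin that converts all links to protocol-relative URLs (removing http: and https:) based on the tags and attributes I list in the $tag and $attribute variables. This is part of the function. To save space, the rest of the code can be found here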

$content_type = NULL;
# Check for 'Content-Type' headers only
foreach ( headers_list() as $header ) {
    if ( strpos( strtolower( $header ), 'content-type:' ) === 0 ) {
        $pieces = explode( ':', strtolower( $header ) );
        $content_type = trim( $pieces[1] );
        break;
    }
}
# If the content-type is 'NULL' or 'text/html', apply rewrite
if ( is_null( $content_type ) || substr( $content_type, 0, 9 ) === 'text/html' ) {
    $tag = 'a|base|div|form|iframe|img|link|meta|script|svg';
    $attribute = 'action|content|data-project-file|href|src|srcset|style';
    # If 'Protocol Relative URL' option is checked, only apply change to internal links
    if ( $this->option == 1 ) {
        # Remove protocol from home URL
        $website = preg_replace( '/https?:\/\//', '', home_url() );
        # Remove protocol from internal links
        $links = preg_replace( '/(<(' . $tag . ')([^>]*)(' . $attribute . ')=["\'])https?:\/\/' . $website . '/i', '$1//' . $website, $links );
    }
    # Else, remove protocols from all links
    else {
        $links = preg_replace( '/(<(' . $tag . ')([^>]*)(' . $attribute . ')=["\'])https?:\/\//i', '$1//', $links );
    }
}
# Return protocol relative links
return $links;

This works as expected, but it does not work for these examples:

<!-- Within the 'style' attribute -->
<div class="some-class" style='background-color:rgba(255,255,255,0);background-image:url("http://placehold.it/300x200");background-position:center center;background-repeat:no-repeat'>
<!-- Within the 'srcset' attribute -->
<img src="http://placehold.it/600x300" srcset="http://placehold.it/500 500x, http://placehold.it/100 100w">

However, the code partially works for these examples:

<div class="some-class" style='background-color:rgba(255,255,255,0);background-image:url("http://placehold.it/300x200");background-position:center center;background-repeat:no-repeat'>
<img src="http://placehold.it/600x300" srcset="//placehold.it/500 500x, http://placehold.it/100 100w">

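I have been adding other values to the $tag and $attribute variables, but that has not helped. I assume I need to update the rest of my regular expression to cover these two additional attributes? Or is there a different way to approach it, such as DOMDocument?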
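For reference, the DOMDocument route mentioned above might look roughly like the sketch below. The function name `protocol_relative_dom` is hypothetical and not part of the plugin, and the attribute list is simply copied from `$attribute`. The sketch assumes the markup can be parsed by libxml and that getting a full document back (with doctype and wrapper elements) is acceptable.

```php
<?php
// Hypothetical helper, not part of the plugin: rewrites URLs through the DOM
// instead of matching tags with a regular expression.
function protocol_relative_dom( $html ) {
    // Same attribute names as the $attribute variable in the question.
    $attributes = array( 'action', 'content', 'data-project-file', 'href', 'src', 'srcset', 'style' );

    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them
    // (HTML5 elements and loose markup commonly trigger them).
    libxml_use_internal_errors( true );
    $dom->loadHTML( $html );
    libxml_clear_errors();

    $xpath = new DOMXPath( $dom );
    foreach ( $attributes as $name ) {
        // Select every element that carries this attribute.
        foreach ( $xpath->query( '//*[@' . $name . ']' ) as $element ) {
            $value = $element->getAttribute( $name );
            // Replace every URL in the value, not just the first one.
            $element->setAttribute( $name, preg_replace( '/https?:\/\//i', '//', $value ) );
        }
    }

    // saveHTML() returns a whole document, including a doctype and
    // <html>/<body> wrappers that were not in the input fragment.
    return $dom->saveHTML();
}
```

Because each attribute value is handled as a whole string, every candidate in a `srcset` list and every `url(...)` inside a `style` value is rewritten in one pass.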

1 Answer:

Answer 0: (score: 0)

I was able to simplify the code by doing the following:

$content_type = NULL;
# Check for 'Content-Type' headers only
foreach ( headers_list() as $header ) {
    if ( strpos( strtolower( $header ), 'content-type:' ) === 0 ) {
        $pieces = explode( ':', strtolower( $header ) );
        $content_type = trim( $pieces[1] );
        break;
    }
}
# If the content-type is 'NULL' or 'text/html', apply rewrite
if ( is_null( $content_type ) || substr( $content_type, 0, 9 ) === 'text/html' ) {
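    # Use the current host name (it carries no protocol) as the site address
    $website = $_SERVER['HTTP_HOST'];
    # str_replace() matches literally, so a regex is needed for the optional 's'
    $links = preg_replace( '|https?://' . preg_quote( $website, '|' ) . '|i', '//' . $website, $links );
    # Remove the protocol from every remaining URL, wherever it appears in the output
    $links = preg_replace( '|https?://|i', '//', $links );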
}
# Return protocol relative links
return $links;
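Note that the approach above removes the protocol from every URL in the output, including URLs in visible text. If only attribute values should change, a narrower variant is sketched below. The function name `make_protocol_relative` is hypothetical, and the attribute list is copied from the question. The sketch assumes attribute values are always quoted, and it will also match attribute-like text that happens to appear outside a tag.

```php
<?php
// Hypothetical helper, not part of the original plugin: removes "http:" and
// "https:" from URLs inside the listed attributes only.
function make_protocol_relative( $html ) {
    $attribute = 'action|content|data-project-file|href|src|srcset|style';

    // Group 1: attribute name plus "=", not preceded by a word character or
    //          hyphen (so "data-src" is not treated as "src").
    // Group 2: the opening quote. Group 3: the value up to the same quote.
    $pattern = '/((?<![\w-])(?:' . $attribute . ')\s*=\s*)(["\'])(.*?)\2/is';

    return preg_replace_callback( $pattern, function ( $m ) {
        // Rewrite every URL in the value, so each srcset candidate and each
        // url(...) inside a style value is covered.
        $value = preg_replace( '/https?:\/\//i', '//', $m[3] );
        return $m[1] . $m[2] . $value . $m[2];
    }, $html );
}
```

Matching the whole attribute value first and rewriting inside a callback avoids the limitation in the question's pattern, where only one URL per tag could be replaced.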