PHP equivalent of Python's `urljoin`

Date: 2016-10-15 01:03:03

Tags: php

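What is the PHP equivalent for building a URL from a base URL and a potentially relative path? Python provides urlparse.urljoin, but there does not seem to be any standard implementation in PHP.

The closest I have found is people suggesting parse_url and then rebuilding the URL from its parts. Implementations that do this usually get protocol-relative links wrong (for example, //example.com/foo should become http://example.com/foo or https://example.com/foo, inheriting the base URL's protocol), and they do not easily handle things like parent-directory links. Here are examples that work correctly with urlparse.urljoin: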

>>> from urlparse import urljoin
>>> urljoin('http://example.com/some/directory/filepart', 'foo.jpg')
'http://example.com/some/directory/foo.jpg'
>>> urljoin('http://example.com/some/directory/', 'foo.jpg')
'http://example.com/some/directory/foo.jpg'
>>> urljoin('http://example.com/some/directory/', '../foo.jpg')
'http://example.com/some/foo.jpg'
>>> urljoin('http://example.com/some/directory/', '/foo.jpg')
'http://example.com/foo.jpg'
>>> urljoin('http://example.com/some/directory/', '//images.example.com/bar.jpg')
'http://images.example.com/bar.jpg'
>>> urljoin('https://example.com/some/directory/', '//images.example.com/bar.jpg')
'https://images.example.com/bar.jpg'
>>> urljoin('ftp://example.com/some/directory/', '//images.example.com/bar.jpg') 
'ftp://images.example.com/bar.jpg'
>>> urljoin('http://example.com:8080/some/directory/', '//images.example.com/bar.jpg')
'http://images.example.com/bar.jpg'
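
The session above uses the Python 2 module name `urlparse`. As a minimal sketch, assuming Python 3 (where the same function lives in `urllib.parse`), the eight cases can be written as a table of expectations that any port could be checked against. The names `CASES` and `check_cases` are made up for illustration.

```python
from urllib.parse import urljoin

# (base URL, relative reference, expected result), copied from the session above
CASES = [
    ("http://example.com/some/directory/filepart", "foo.jpg",
     "http://example.com/some/directory/foo.jpg"),
    ("http://example.com/some/directory/", "foo.jpg",
     "http://example.com/some/directory/foo.jpg"),
    ("http://example.com/some/directory/", "../foo.jpg",
     "http://example.com/some/foo.jpg"),
    ("http://example.com/some/directory/", "/foo.jpg",
     "http://example.com/foo.jpg"),
    ("http://example.com/some/directory/", "//images.example.com/bar.jpg",
     "http://images.example.com/bar.jpg"),
    ("https://example.com/some/directory/", "//images.example.com/bar.jpg",
     "https://images.example.com/bar.jpg"),
    ("ftp://example.com/some/directory/", "//images.example.com/bar.jpg",
     "ftp://images.example.com/bar.jpg"),
    ("http://example.com:8080/some/directory/", "//images.example.com/bar.jpg",
     "http://images.example.com/bar.jpg"),
]


def check_cases(cases):
    """Return the cases whose urljoin() result differs from the expectation."""
    failures = []
    for base, rel, expected in cases:
        actual = urljoin(base, rel)
        if actual != expected:
            failures.append((base, rel, expected, actual))
    return failures


if __name__ == "__main__":
    # An empty list means every case matched
    print(check_cases(CASES))
```

Note the last case: a protocol-relative reference replaces the whole authority, so the base URL's port 8080 is not carried over.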

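Is there an idiomatic way of achieving the same in PHP, or a well-regarded simple library or implementation that actually gets all of these cases right?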

1 answer:

Answer 0: (score: 2)

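Because there is clearly a need for this functionality, and none of the random scripts out there cover all the bases, I have started a project on Github to try to do it right.

The implementation of urljoin() is currently as follows: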

function urljoin($base, $rel) {
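    $pbase = parse_url($base);
    $prel = parse_url($rel);

    // parse_url() omits the 'path' key when the URL has no path,
    // so fall back to defaults instead of indexing a missing key
    $basePath = isset($pbase['path']) ? $pbase['path'] : '/';
    $relPath = isset($prel['path']) ? $prel['path'] : '';

    $merged = array_merge($pbase, $prel);
    if ($relPath === '') {
        // No path in $rel (e.g. '//host', '?query' or '#fragment'):
        // a new host starts at the root, otherwise keep the base path
        $merged['path'] = isset($prel['host']) ? '/' : $basePath;
    } else if ($relPath[0] != '/') {
        // Relative path: resolve it against the directory of the base path
        $dir = preg_replace('@/[^/]*$@', '', $basePath);
        $merged['path'] = $dir . '/' . $relPath;
    }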

    // Get the path components, and remove the initial empty one
    $pathParts = explode('/', $merged['path']);
    array_shift($pathParts);

    $path = [];
    $prevPart = '';
    foreach ($pathParts as $part) {
        if ($part == '..' && count($path) > 0) {
            // Cancel out the parent directory (if there's a parent to cancel)
            $parent = array_pop($path);
            // But if it was also a parent directory, leave it in
            if ($parent == '..') {
                array_push($path, $parent);
                array_push($path, $part);
            }
        } else if ($prevPart != '' || ($part != '.' && $part != '')) {
            // Don't include empty or current-directory components
            if ($part == '.') {
                $part = '';
            }
            array_push($path, $part);
        }
        $prevPart = $part;
    }
    $merged['path'] = '/' . implode('/', $path);

    $ret = '';
    if (isset($merged['scheme'])) {
        $ret .= $merged['scheme'] . ':';
    }

    if (isset($merged['scheme']) || isset($merged['host'])) {
        $ret .= '//';
    }

    if (isset($prel['host'])) {
        $hostSource = $prel;
    } else {
        $hostSource = $pbase;
    }

    // username, password, and port are associated with the hostname, not merged
    if (isset($hostSource['host'])) {
        if (isset($hostSource['user'])) {
            $ret .= $hostSource['user'];
            if (isset($hostSource['pass'])) {
                $ret .= ':' . $hostSource['pass'];
            }
            $ret .= '@';
        }
        $ret .= $hostSource['host'];
        if (isset($hostSource['port'])) {
            $ret .= ':' . $hostSource['port'];
        }
    }

    if (isset($merged['path'])) {
        $ret .= $merged['path'];
    }

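    if (isset($prel['query'])) {
        $ret .= '?' . $prel['query'];
    } else if (!isset($prel['host']) && isset($pbase['query'])
            && (!isset($prel['path']) || $prel['path'] === '')) {
        // An empty or fragment-only reference keeps the base URL's query
        // (RFC 3986, section 5.2.2)
        $ret .= '?' . $pbase['query'];
    }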

    if (isset($prel['fragment'])) {
        $ret .= '#' . $prel['fragment'];
    }


    return $ret;
}

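This function correctly handles users, passwords, port numbers, query strings, anchors, and even file:/// URLs (which seems to be a common shortcoming of existing functions of this kind).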
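
One way to check those claims is to compare against what Python's urljoin returns for the same kinds of input. The sketch below assumes Python 3. The URLs in `PAIRS` and the helper name `reference_results` are made up for illustration, and nothing here has been run against the PHP function.

```python
from urllib.parse import urljoin

# Hypothetical inputs covering credentials, a port, query strings,
# fragments, and a file:/// base URL
PAIRS = [
    ("http://user:pass@example.com:8080/a/b?x=1#frag", "c?y=2#top"),
    ("http://example.com/a/b?x=1", "#top"),
    ("file:///home/user/docs/index.html", "../img/a.png"),
]


def reference_results(pairs):
    """Map each (base, relative) pair to what Python's urljoin returns."""
    return {(base, rel): urljoin(base, rel) for base, rel in pairs}


if __name__ == "__main__":
    for (base, rel), joined in reference_results(PAIRS).items():
        print(base, "+", rel, "->", joined)
```

Feeding the same pairs to the PHP urljoin() and comparing the strings would show where the two implementations differ. A difference is not automatically a bug: some are just different choices, such as how to handle a leading `..` above the root.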