Sanitizing file paths in PHP without realpath()

Time: 2014-01-29 03:39:14

Tags: php security sanitization

Is there a way to safely sanitize path input without using realpath()?

The goal is to prevent malicious input such as ../../../../../path/to/file in code like this:
 $handle = fopen($path . '/' . $filename, 'r');

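7 answers:

Answer 0 (score: 6)

RFC 3986 describes a Remove Dot Segments algorithm that is used to interpret and remove the special . and .. complete path segments from a referenced path during relative URI reference resolution.

You can use this algorithm for file system paths as well: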

// as per RFC 3986
// @see http://tools.ietf.org/html/rfc3986#section-5.2.4
function remove_dot_segments($input) {
    // 1.  The input buffer is initialized with the now-appended path
    //     components and the output buffer is initialized to the empty
    //     string.
    $output = '';

    // 2.  While the input buffer is not empty, loop as follows:
    while ($input !== '') {
        // A.  If the input buffer begins with a prefix of "`../`" or "`./`",
        //     then remove that prefix from the input buffer; otherwise,
        if (
            ($prefix = substr($input, 0, 3)) == '../' ||
            ($prefix = substr($input, 0, 2)) == './'
           ) {
            $input = substr($input, strlen($prefix));
        } else

        // B.  if the input buffer begins with a prefix of "`/./`" or "`/.`",
        //     where "`.`" is a complete path segment, then replace that
        //     prefix with "`/`" in the input buffer; otherwise,
        if (
            ($prefix = substr($input, 0, 3)) == '/./' ||
            ($prefix = $input) == '/.'
           ) {
            $input = '/' . substr($input, strlen($prefix));
        } else

        // C.  if the input buffer begins with a prefix of "/../" or "/..",
        //     where "`..`" is a complete path segment, then replace that
        //     prefix with "`/`" in the input buffer and remove the last
        //     segment and its preceding "/" (if any) from the output
        //     buffer; otherwise,
        if (
            ($prefix = substr($input, 0, 4)) == '/../' ||
            ($prefix = $input) == '/..'
           ) {
            $input = '/' . substr($input, strlen($prefix));
            $output = substr($output, 0, strrpos($output, '/'));
        } else

        // D.  if the input buffer consists only of "." or "..", then remove
        //     that from the input buffer; otherwise,
        if ($input == '.' || $input == '..') {
            $input = '';
        } else

        // E.  move the first path segment in the input buffer to the end of
        //     the output buffer, including the initial "/" character (if
        //     any) and any subsequent characters up to, but not including,
        //     the next "/" character or the end of the input buffer.
        {
            $pos = strpos($input, '/');
            if ($pos === 0) $pos = strpos($input, '/', $pos+1);
            if ($pos === false) $pos = strlen($input);
            $output .= substr($input, 0, $pos);
            $input = (string) substr($input, $pos);
        }
    }

    // 3.  Finally, the output buffer is returned as the result of remove_dot_segments.
    return $output;
}

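Answer 1 (score: 5)

Not sure why you wouldn't want to use realpath, but path name sanitization is a very simple concept, along the following lines:

  • If the path is relative (does not start with /), prefix it with the current working directory and a /, making it an absolute path.
  • Replace every sequence of more than one / with a single one (a).
  • Replace every /./ with /.
  • Remove /. if it is at the end.
  • Replace /anything/../ with /.
  • Remove /anything/.. if it is at the end.

The text anything in this case means the longest sequence of characters that are not /.

Note that these rules should be applied repeatedly until none of them results in a change. In other words, do all six (one pass). If the string changed, go back and do all six again (another pass). Keep doing that until the string is the same as it was before the pass just executed.

Once those steps are done, you have a canonical path name that can be checked against a valid pattern. Most likely that will be anything that does not start with ../ (in other words, it does not try to move above the starting point). There may be other rules you want to apply, but that is outside the scope of this question.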


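(a) If you are working on a system that treats // at the start of a path as special, make sure you replace multiple / characters at the start with two of them. This is the only place where POSIX allows (but does not mandate) special handling of multiples; in all other cases, multiple / characters are equivalent to a single one.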
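
The six rules in this answer could be sketched roughly as follows. This is only an illustration, not the answerer's code: the function name canonicalize_by_rules and the $cwd parameter are made up for the example, the sketch collapses every run of / (it ignores the leading // special case from the footnote), it never treats . or .. as "anything", and it returns / when the rules reduce the path to an empty string.

```php
<?php
// Hypothetical helper: applies the six rules repeatedly until nothing changes.
function canonicalize_by_rules(string $path, string $cwd = '/'): string
{
    // Rule 1: make a relative path absolute by prefixing the working directory.
    if ($path === '' || $path[0] !== '/') {
        $path = rtrim($cwd, '/') . '/' . $path;
    }

    do {
        $before = $path;
        // Rule 2: collapse runs of "/" (assumption: no special leading "//").
        $path = preg_replace('~/{2,}~', '/', $path);
        // Rule 3: "/./" becomes "/".
        $path = str_replace('/./', '/', $path);
        // Rule 4: drop a trailing "/.".
        $path = preg_replace('~/\.\z~', '', $path);
        // Rule 5: "/anything/../" becomes "/" (anything is never "." or "..").
        $path = preg_replace('~/(?!\.\.?/)[^/]+/\.\./~', '/', $path);
        // Rule 6: drop a trailing "/anything/..".
        $path = preg_replace('~/(?!\.\.?/)[^/]+/\.\.\z~', '', $path);
    } while ($path !== $before);

    // Assumption: an empty result means the root directory.
    return $path === '' ? '/' : $path;
}
```

With this sketch, a path that still begins with /../ after canonicalization tried to climb above the root, so the caller can reject it before passing it to fopen().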

Answer 2 (score: 2)

The following function canonicalizes file system paths and the path component of URIs. It is faster than Gumbo's RFC implementation.

function canonicalizePath($path)
{
    $path = explode('/', $path);
    $stack = array();
    foreach ($path as $seg) {
        if ($seg == '..') {
            // Ignore this segment, remove last segment from stack
            array_pop($stack);
            continue;
        }

        if ($seg == '.') {
            // Ignore this segment
            continue;
        }

        $stack[] = $seg;
    }

    return implode('/', $stack);
}

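Notes

  • It does not remove sequences of multiple /, because doing so would not comply with RFC 3986.
  • Obviously, this does not work with ..\backslash\paths.
  • I am not sure this function is 100% secure, but I have not been able to come up with an input that compromises its output.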
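
For cases where silently discarding a leading .. is not wanted, a variant of the same stack approach might look like the sketch below. It is not part of the answer above: the name canonicalizeOrNull is hypothetical, and the sketch assumes that backslashes should be treated as separators, that empty segments may be collapsed (which departs from RFC 3986), and that returning null is an acceptable way to signal an attempt to climb above the starting point.

```php
<?php
// Hypothetical variant: returns null instead of ignoring a ".." that has nothing to pop.
function canonicalizeOrNull(string $path): ?string
{
    // Assumption: "\" is a separator too, so "..\" is handled like "../".
    $path = str_replace('\\', '/', $path);
    $absolute = ($path !== '' && $path[0] === '/');

    $stack = array();
    foreach (explode('/', $path) as $seg) {
        if ($seg === '' || $seg === '.') {
            // Empty segments (from "//" or a leading "/") and "." are skipped.
            continue;
        }
        if ($seg === '..') {
            if (count($stack) === 0) {
                // Nothing left to remove: the path tries to escape.
                return null;
            }
            array_pop($stack);
            continue;
        }
        $stack[] = $seg;
    }

    return ($absolute ? '/' : '') . implode('/', $stack);
}
```

A caller could then refuse to open the file whenever the result is null, rather than trusting a path that was quietly rewritten.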

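Answer 3 (score: 2)

Since you only asked for sanitizing, maybe all you need is a "fail on tricky paths" check. If your path input normally never contains anything like ../../stuff/../like/this, you only have to check for it: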

function isTricky($p) {
    if(strpos("/$p/","/../")===false) return false;
    return true;
}

or just

function isTricky($p) {return strpos("-/$p/","/../");}

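This quick and dirty approach blocks any backward move, and in most cases that is enough. (The second version returns a non-zero integer instead of true, but hey, why not! The dash is a hack so that a match can never sit at string index 0.)

Side note: also remember slashes versus backslashes. I would recommend converting backslashes to plain slashes first, but that is platform dependent.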
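
The side note about backslashes could be folded into the check like this. It is a minimal sketch, assuming every backslash in the input is meant as a path separator; the name isTrickyAnySlash is made up for the example.

```php
<?php
// Hypothetical variant of isTricky() that also catches "..\" style segments.
function isTrickyAnySlash(string $p): bool
{
    // Assumption: treat "\" as a separator by converting it to "/".
    $normalized = str_replace('\\', '/', $p);
    // Wrapping in slashes makes a ".." at the very start or end a full segment.
    return strpos('/' . $normalized . '/', '/../') !== false;
}
```

Names that merely contain two dots, such as a/b..c/d, are not flagged, because only a complete .. segment matches.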

Answer 4 (score: 0)

Since the functions above did not work for me (or were rather long), I tried my own code:

function clean_path( $A_path="", $A_echo=false )
{
    // IF YOU WANT LEANER CODE, REMOVE ALL "if" LINES AND $A_echo IN ARGS
    $_p                            = func_get_args();
    // HOW IT WORKS:
    // REMOVING EMPTY ELEMENTS AT THE END ALLOWS FOR "BUFFERS" AND HANDLING START & END SPEC. SEQUENCES
    // BLANK ELEMENTS AT START & END MAKE SURE WE COVER SPECIALS AT BEGIN & END
    // REPLACING ":" AGAINST "://" MAKES AN EMPTY ELEMENT TO ALLOW FOR CORRECT x:/../<path> USE (which, in principle is faulty)

    // 1.) "normalize" TO "slashed" AND MAKE SOME SPECIALS, ALSO DUMMY ELEMENTS AT BEGIN & END 
        $_s                        = array( "\\", ":", ":./", ":../");
        $_r                        = array( "/", "://", ":/", ":/" );
        $_p['sr']                = "/" . str_replace( $_s, $_r, $_p[0] ) . "/";
        $_p['arr']                = explode('/', $_p['sr'] );
                                                                                if ( $A_echo ) $_p['arr1']    = $_p['arr'];
    // 2.) GET KEYS OF ".." ELEMENTS, REMOVE THEM AND THE ONE BEFORE (!) AS THAT MEANS "UP" AND THAT DISABLES STEP BEFORE
        $_p['pp']                = array_keys( $_p['arr'], '..' );
        foreach($_p['pp'] as $_pos )
        {
            $_p['arr'][ $_pos-1 ] = $_p['arr'][ $_pos ] ="";
        }
                                                                                if ( $A_echo ) $_p['arr2']    = $_p['arr'];
    // 3.) REMOVE ALL "/./" PARTS AS THEY ARE SIMPLY SUPERFLUOUS
        $_p['p']                = array_keys( $_p['arr'], '.' );
        foreach($_p['p'] as $_pos )
        {
            unset( $_p['arr'][ $_pos ] );
        }
                                                                                if ( $A_echo ) $_p['arr3']    = $_p['arr'];
    // 4.) CLEAN OUT EMPTY ONES INCLUDING OUR DUMMIES
        $_p['arr']                = array_filter( $_p['arr'] );
    // 5) MAKE FINAL STRING
        $_p['clean']            = implode( DIRECTORY_SEPARATOR, $_p['arr'] );
                                                                                if ($A_echo){ echo "arr=="; print_R( $_p  ); };
    return $_p['clean'];    
}

Answer 5 (score: 0)

I prefer an implode/explode solution:

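public function sanitize(?string $path = null, string $separator = DIRECTORY_SEPARATOR) : string
{
    $pathArray = explode($separator, (string) $path);
    foreach ($pathArray as $key => $value)
    {
        if ($value === '.' || $value === '..')
        {
            $pathArray[$key] = null;
        }
    }
    // Drop only null and empty segments; a bare array_filter() would also drop a segment named "0".
    $kept = array_filter($pathArray, function ($value) { return $value !== null && $value !== ''; });
    return implode($separator, array_map('trim', $kept));
}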

The previous version looked like this:

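public function sanitize(?string $path = null, string $separator = DIRECTORY_SEPARATOR) : string
{
    // Strip spaces and ".." sequences; the replacement is an empty string rather than null.
    $output = str_replace(
    [
        ' ',
        '..',
    ], '', (string) $path);
    // Collapse repeated separators; preg_quote() keeps a "\" separator from breaking the pattern.
    $output = preg_replace('~' . preg_quote($separator, '~') . '+~', $separator, $output);
    $output = ltrim($output, '.');
    $output = trim($output, $separator);
    return $output;
}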

Both have been tested successfully against this data provider. Enjoy!

Answer 6 (score: -2)

Simple form:

$filename = str_replace('..', '', $filename);

if (file_exists($path . '/' . $filename)) {
    $handle = fopen($path . '/' . $filename, 'r');
}

Complex form (from here):

function canonicalize($address)
{
    $address = explode('/', $address);
    $keys = array_keys($address, '..');

    foreach($keys AS $keypos => $key)
    {
        array_splice($address, $key - ($keypos * 2 + 1), 2);
    }

    $address = implode('/', $address);
    $address = str_replace('./', '', $address);
    return $address;
}
echo canonicalize('/dir1/../dir2/'); // returning /dir2/