浏览器无法读取包含特殊字符的文件名

时间:2013-10-18 07:31:02

标签: php wordpress hosting

我有一个文件名为Chu Thái.jpg的图片。将其上传到媒体库时,托管中的文件名已重命名为Chu-Thái.jpg,但图片的路径与文件名不同:http://bem.vn/httq/wp-content/uploads/sites/2/2013/10/Chu-Thái.jpg

因此,当将网址复制到浏览器时,它表示在此服务器上找不到该文件。

The requested URL /wp-head/wp-content/uploads/sites/2/2013/10/Chu-Thái-150x150.jpg was not found on this server.

我想知道Wordpress或我的托管引起的问题是怎么回事?

1 个答案:

答案 0 :(得分:1)

问题是您不应上传包含特殊字符的文件。我在plugin of mine中使用的是过滤器sanitize_file_name

我最终提取并调整了this plugin中的3个函数,以便对上传的文件名进行全面清理,以免出现此类错误:

add_filter( 'sanitize_file_name', 't5_sanitize_filename', 10 );

/**
 * Clean up uploaded file names
 * 
 * Sanitization test done with the filename:
 * ÄäÆæÀàÁáÂâÃãÅåªₐāĆćÇçÐđÈèÉéÊêËëₑƒğĞÌìÍíÎîÏïīıÑñⁿÒòÓóÔôÕõØøₒÖöŒœßŠšşŞ™ÙùÚúÛûÜüÝýÿŽž¢€‰№$℃°C℉°F⁰¹²³⁴⁵⁶⁷⁸⁹₀₁₂₃₄₅₆₇₈₉±×₊₌⁼⁻₋–—‑․‥…‧.png
 * @author toscho
 * @url    https://github.com/toscho/Germanix-WordPress-Plugin
 */
function t5_sanitize_filename( $filename )
{
    $filename    = html_entity_decode( $filename, ENT_QUOTES, 'utf-8' );
    $filename    = t5_translit( $filename );
    $filename    = t5_lower_ascii( $filename );
    $filename    = t5_remove_doubles( $filename );
    return $filename;
}

/**
 * Converts uppercase characters to lowercase and removes the rest.
 * https://github.com/toscho/Germanix-WordPress-Plugin
 *
 * @uses   apply_filters( 'germanix_lower_ascii_regex' )
 * @param  string $str Input string
 * @return string
 */
function t5_lower_ascii( $str )
{
    $str     = strtolower( $str );
    $regex   = array(
        'pattern'        => '~([^a-z\d_.-])~'
        , 'replacement'  => ''
    );
    // Leave underscores, otherwise the taxonomy tag cloud in the
    // backend won’t work anymore.
    return preg_replace( $regex['pattern'], $regex['replacement'], $str );
}

/**
 * Reduces repeated meta characters (-=+.) to one.
 * https://github.com/toscho/Germanix-WordPress-Plugin
 *
 * @uses   apply_filters( 'germanix_remove_doubles_regex' )
 * @param  string $str Input string
 * @return string
 */
function t5_remove_doubles( $str )
{
    $regex = apply_filters(
            'germanix_remove_doubles_regex'
            , array(
                'pattern'        => '~([=+.-])\\1+~'
                , 'replacement'  => "\\1"
            )
    );
    return preg_replace( $regex['pattern'], $regex['replacement'], $str );
}    

/**
 * Replaces non ASCII chars.
 * https://github.com/toscho/Germanix-WordPress-Plugin
 *
 * wp-includes/formatting.php#L531 is unfortunately completely inappropriate.
 * Modified version of Heiko Rabe’s code.
 *
 * @author Heiko Rabe http://code-styling.de
 * @link   http://www.code-styling.de/?p=574
 * @param  string $str
 * @return string
 */
function t5_translit( $str )
{
    $utf8 = array(
        'Ä'  => 'Ae'
        , 'ä'    => 'ae'
        , 'Æ'    => 'Ae'
        , 'æ'    => 'ae'
        , 'À'    => 'A'
        , 'à'    => 'a'
        , 'Á'    => 'A'
        , 'á'    => 'a'
        , 'Â'    => 'A'
        , 'â'    => 'a'
        , 'Ã'    => 'A'
        , 'ã'    => 'a'
        , 'Å'    => 'A'
        , 'å'    => 'a'
        , 'ª'    => 'a'
        , 'ₐ'    => 'a'
        , 'ā'    => 'a'
        , 'Ć'    => 'C'
        , 'ć'    => 'c'
        , 'Ç'    => 'C'
        , 'ç'    => 'c'
        , 'Ð'    => 'D'
        , 'đ'    => 'd'
        , 'È'    => 'E'
        , 'è'    => 'e'
        , 'É'    => 'E'
        , 'é'    => 'e'
        , 'Ê'    => 'E'
        , 'ê'    => 'e'
        , 'Ë'    => 'E'
        , 'ë'    => 'e'
        , 'ₑ'    => 'e'
        , 'ƒ'    => 'f'
        , 'ğ'    => 'g'
        , 'Ğ'    => 'G'
        , 'Ì'    => 'I'
        , 'ì'    => 'i'
        , 'Í'    => 'I'
        , 'í'    => 'i'
        , 'Î'    => 'I'
        , 'î'    => 'i'
        , 'Ï'    => 'Ii'
        , 'ï'    => 'ii'
        , 'ī'    => 'i'
        , 'ı'    => 'i'
        , 'I'    => 'I' // turkish, correct?
        , 'Ñ'    => 'N'
        , 'ñ'    => 'n'
        , 'ⁿ'    => 'n'
        , 'Ò'    => 'O'
        , 'ò'    => 'o'
        , 'Ó'    => 'O'
        , 'ó'    => 'o'
        , 'Ô'    => 'O'
        , 'ô'    => 'o'
        , 'Õ'    => 'O'
        , 'õ'    => 'o'
        , 'Ø'    => 'O'
        , 'ø'    => 'o'
        , 'ₒ'    => 'o'
        , 'Ö'    => 'Oe'
        , 'ö'    => 'oe'
        , 'Œ'    => 'Oe'
        , 'œ'    => 'oe'
        , 'ß'    => 'ss'
        , 'Š'    => 'S'
        , 'š'    => 's'
        , 'ş'    => 's'
        , 'Ş'    => 'S'
        , '™'    => 'TM'
        , 'Ù'    => 'U'
        , 'ù'    => 'u'
        , 'Ú'    => 'U'
        , 'ú'    => 'u'
        , 'Û'    => 'U'
        , 'û'    => 'u'
        , 'Ü'    => 'Ue'
        , 'ü'    => 'ue'
        , 'Ý'    => 'Y'
        , 'ý'    => 'y'
        , 'ÿ'    => 'y'
        , 'Ž'    => 'Z'
        , 'ž'    => 'z'
        // misc
        , '¢'    => 'Cent'
        , '€'    => 'Euro'
        , '‰'    => 'promille'
        , '№'    => 'Nr'
        , '$'    => 'Dollar'
        , '℃'    => 'Grad Celsius'
        , '°C' => 'Grad Celsius'
        , '℉'    => 'Grad Fahrenheit'
        , '°F' => 'Grad Fahrenheit'
        // Superscripts
        , '⁰'    => '0'
        , '¹'    => '1'
        , '²'    => '2'
        , '³'    => '3'
        , '⁴'    => '4'
        , '⁵'    => '5'
        , '⁶'    => '6'
        , '⁷'    => '7'
        , '⁸'    => '8'
        , '⁹'    => '9'
        // Subscripts
        , '₀'    => '0'
        , '₁'    => '1'
        , '₂'    => '2'
        , '₃'    => '3'
        , '₄'    => '4'
        , '₅'    => '5'
        , '₆'    => '6'
        , '₇'    => '7'
        , '₈'    => '8'
        , '₉'    => '9'
        // Operators, punctuation
        , '±'    => 'plusminus'
        , '×'    => 'x'
        , '₊'    => 'plus'
        , '₌'    => '='
        , '⁼'    => '='
        , '⁻'    => '-' // sup minus
        , '₋'    => '-' // sub minus
        , '–'    => '-' // ndash
        , '—'    => '-' // mdash
        , '‑'    => '-' // non breaking hyphen
        , '․'    => '.' // one dot leader
        , '‥'    => '..'  // two dot leader
        , '…'    => '...'  // ellipsis
        , '‧'    => '.' // hyphenation point
        , ' '    => '-'   // nobreak space
        , ' '    => '-'   // normal space
    );

    $str = strtr( $str, $utf8 );
    return trim( $str, '-' );
}