Googlebot receives error messages

Asked: 2010-09-14 09:49:12

Tags: php detection googlebot multiple-browsers

I have the following code as the index.php of my multilingual website. Each available language has its own subdirectory.

<?php


if (isset($_POST['URL']) && strlen($_POST['URL']) == 3) {
         header("location: ".$_POST[URL]);
}
else {

    function lixlpixel_get_env_var($Var)
    {
         if(empty($GLOBALS[$Var]))
         {
             $GLOBALS[$Var]=(!empty($GLOBALS['_SERVER'][$Var]))?
             $GLOBALS['_SERVER'][$Var] : (!empty($GLOBALS['HTTP_SERVER_VARS'][$Var])) ? $GLOBALS['HTTP_SERVER_VARS'][$Var]:'';
         }
    }

    function lixlpixel_detect_lang()
    {
         // Detect HTTP_ACCEPT_LANGUAGE & HTTP_USER_AGENT.
         lixlpixel_get_env_var('HTTP_ACCEPT_LANGUAGE');
         lixlpixel_get_env_var('HTTP_USER_AGENT');

         $_AL=strtolower($GLOBALS['HTTP_ACCEPT_LANGUAGE']);
         $_UA=strtolower($GLOBALS['HTTP_USER_AGENT']);

         // Try to detect Primary language if several languages are accepted.
         foreach($GLOBALS['_LANG'] as $K)
         {
             if(strpos($_AL, $K)===0)
             return $K;
         }

         // Try to detect any language if not yet detected.
         foreach($GLOBALS['_LANG'] as $K)
         {
             if(strpos($_AL, $K)!==false)
             return $K;
         }
         foreach($GLOBALS['_LANG'] as $K)
         {
             if(preg_match("/[[( ]{$K}[;,_-)]/",$_UA))
             return $K;
         }

         // Return default language if language is not yet detected.
         return $GLOBALS['_DLANG'];
    }

    // Define default language.
    $GLOBALS['_DLANG']='en';

    // Define all available languages.
    // WARNING: uncomment all available languages

    $GLOBALS['_LANG'] = array(
    'en', // english.
    'es', // spanish.
    'fr', // french.
    );

    /*
    $GLOBALS['_LANG'] = array(
    'af', // afrikaans.
    'ar', // arabic.
    'bg', // bulgarian.
    'ca', // catalan.
    'cs', // czech.
    'da', // danish.
    'de', // german.
    'el', // greek.
    'en', // english.
    'es', // spanish.
    'et', // estonian.
    'fi', // finnish.
    'fr', // french.
    'gl', // galician.
    'he', // hebrew.
    'hi', // hindi.
    'hr', // croatian.
    'hu', // hungarian.
    'id', // indonesian.
    'it', // italian.
    'ja', // japanese.
    'ko', // korean.
    'ka', // georgian.
    'lt', // lithuanian.
    'lv', // latvian.
    'ms', // malay.
    'nl', // dutch.
    'no', // norwegian.
    'pl', // polish.
    'pt', // portuguese.
    'ro', // romanian.
    'ru', // russian.
    'sk', // slovak.
    'sl', // slovenian.
    'sq', // albanian.
    'sr', // serbian.
    'sv', // swedish.
    'th', // thai.
    'tr', // turkish.
    'uk', // ukrainian.
    'zh' // chinese.
    );
    */

    // Redirect to the correct location.


    header('location: /'.lixlpixel_detect_lang());
    //header('location: http://www.your_site.com/index_'.lixlpixel_detect_lang().'.php'); // Example Implementation
    echo 'The Language detected is: '.lixlpixel_detect_lang(); // For Demonstration

}

?>

The problem is that, although this works very well in users' browsers, search engines (such as Googlebot) trigger the following errors:

    <br />
<b>Warning</b>:  preg_match() [<a href='function.preg-match'>function.preg-match</a>]: Compilation failed: range out of order in character class at offset 12 in <b>/index.php</b> on line <b>41</b><br />
<br />
<b>Warning</b>:  preg_match() [<a href='function.preg-match'>function.preg-match</a>]: Compilation failed: range out of order in character class at offset 12 in <b>/index.php</b> on line <b>41</b><br />
<br />
<b>Warning</b>:  preg_match() [<a href='function.preg-match'>function.preg-match</a>]: Compilation failed: range out of order in character class at offset 12 in <b>/index.php</b> on line <b>41</b><br />
<br />
<b>Warning</b>:  Cannot modify header information - headers already sent by (output started at /index.php:41) in <b>/index.php</b> on line <b>110</b><br />
<br />
<b>Warning</b>:  preg_match() [<a href='function.preg-match'>function.preg-match</a>]: Compilation failed: range out of order in character class at offset 12 in <b>/index.php</b> on line <b>41</b><br />
<br />
<b>Warning</b>:  preg_match() [<a href='function.preg-match'>function.preg-match</a>]: Compilation failed: range out of order in character class at offset 12 in <b>/index.php</b> on line <b>41</b><br />
<br />
<b>Warning</b>:  preg_match() [<a href='function.preg-match'>function.preg-match</a>]: Compilation failed: range out of order in character class at offset 12 in <b>/index.php</b> on line <b>41</b><br />
The Language detected is: en

I have tried error handling, but I'm not a PHP programmer, I'm a CF programmer, so I really need some help!

1 answer:

Answer 0 (score: 2)

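Inside a character class, - denotes a range. In this case, the _-) in [;,_-)] is interpreted as a range (every character from _ to )). But _ (0x5F) comes after ) (0x29), so _-) is an invalid range.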

If you mean the three characters _, - and ), escape the -:

[;,_\-)]
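
As a minimal sketch of the corrected pattern in context: the helper name ua_has_lang() is hypothetical and not part of the original script, and the type hints assume PHP 7 or later. It also illustrates why the warning probably appears only for crawlers: the User-Agent check is reached only when the Accept-Language checks before it find no match, which is presumably the case for Googlebot.

```php
<?php
// Hypothetical helper (not in the original script): checks whether a
// language code appears in a User-Agent string, using the same delimiters
// as the original pattern.
function ua_has_lang(string $ua, string $lang): bool
{
    // "\-" makes the hyphen a literal character, so "_-)" is no longer
    // read as the (invalid) range from "_" to ")".
    $pattern = '/[\[( ]' . preg_quote($lang, '/') . '[;,_\-)]/';

    return preg_match($pattern, strtolower($ua)) === 1;
}

// Example User-Agent strings, used here for illustration only.
$browser = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1)';
$crawler = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

var_dump(ua_has_lang($browser, 'en')); // matches " en-"
var_dump(ua_has_lang($crawler, 'en')); // no delimited "en" in this string
```

Placing the hyphen first or last in the class, as in [;,_)-], should work as well, since a hyphen in that position cannot form a range.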

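Beyond that, Accept-Language is a list of weighted values (see the q parameter). This means that the mere occurrence of a particular language tag does not necessarily make it the most preferred language. There may be more preferred languages (with a higher q value), or the language may not be accepted at all (i.e. q=0).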

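So instead of just looking for the occurrence of a particular language tag, you should parse the list and find the best match between the preferred languages and the available ones.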
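
One way this parsing could look is sketched below. The function names parse_accept_language() and pick_language() are hypothetical, the code assumes PHP 7 or later, and it compares only the primary subtag (so en-US counts as en), which is a simplification rather than a full implementation of language-range matching.

```php
<?php
// Hypothetical helper: turn an Accept-Language header into [tag => q].
function parse_accept_language(string $header): array
{
    $prefs = [];
    foreach (explode(',', $header) as $item) {
        $parts = explode(';', trim($item));
        $tag = strtolower(trim($parts[0]));
        if ($tag === '') {
            continue; // empty header or stray comma
        }
        $q = 1.0; // a tag without a q parameter has weight 1
        foreach (array_slice($parts, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $prefs[$tag] = $q;
    }
    return $prefs;
}

// Hypothetical helper: pick the available language with the highest q.
function pick_language(string $header, array $available, string $default): string
{
    $best = $default;
    $bestQ = 0.0; // q=0 means "not acceptable", so it can never win
    foreach (parse_accept_language($header) as $tag => $q) {
        $primary = explode('-', (string) $tag)[0]; // "en-us" -> "en"
        if ($q > $bestQ && in_array($primary, $available, true)) {
            $best = $primary;
            $bestQ = $q;
        }
    }
    return $best;
}

$available = ['en', 'es', 'fr'];
echo pick_language('en;q=0.1,fr', $available, 'en'), "\n";
echo pick_language('', $available, 'en'), "\n";
```

With the header en;q=0.1,fr, a plain substring search finds en at position 0 and returns it, whereas the weighted comparison selects fr. An empty header, as a crawler might send, falls back to the default.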