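Converting a JavaScript function to a PHP function (for converting a string to HTML-encoded text)

Time: 2011-11-17 04:22:02

Tags: php javascript

I am working from the function at http://www.unicodetools.com/unicode/convert-to-html.php, which converts a string to HTML-encoded text.

The JavaScript is: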

function a(b) {
    var c = '';

    // Declare i locally so it does not leak into the global scope.
    for (var i = 0; i < b.length; i++) {
        if (b.charCodeAt(i) > 127) {
            c += '&#' + b.charCodeAt(i) + ';';
        } else {
            c += b.charAt(i);
        }
    }

    document.forms.conversionForm.outputText.value = c;
}

My attempt is:

function str_to_html_entity($str) {
    $output = NULL;

    for($i = 0; $i < strlen($str); $i++) {
        if(ord($str) > 127) {
            $output .= '&#' + ord($str) + ';'; 
        } else { 
            $output .= substr($str, $i); 
        }
  }

  return $output;
}

echo str_to_html_entity("Thére Àre sôme spëcial charâcters ïn thìs têxt");

My PHP function runs without errors, but the result is not what I expected.

My result:

Thére Àre sôme spëcial charâcters ïn thìs têxthére Àre sôme spëcial charâcters ïn thìs têxtére Àre sôme spëcial charâcters ïn thìs têxt�re Àre sôme spëcial charâcters ïn thìs têxtre Àre sôme spëcial charâcters ïn thìs têxte Àre sôme spëcial charâcters ïn thìs têxt Àre sôme spëcial charâcters ïn thìs têxtÀre sôme spëcial charâcters ïn thìs têxt�re sôme spëcial charâcters ïn thìs têxtre sôme spëcial charâcters ïn thìs têxte sôme spëcial charâcters ïn thìs têxt sôme spëcial charâcters ïn thìs têxtsôme spëcial charâcters ïn thìs têxtôme spëcial charâcters ïn thìs têxt�me spëcial charâcters ïn thìs têxtme spëcial charâcters ïn thìs têxte spëcial charâcters ïn thìs têxt spëcial charâcters ïn thìs têxtspëcial charâcters ïn thìs têxtpëcial charâcters ïn thìs têxtëcial charâcters ïn thìs têxt�cial charâcters ïn thìs têxtcial charâcters ïn thìs têxtial charâcters ïn thìs têxtal charâcters ïn thìs têxtl charâcters ïn thìs têxt charâcters ïn thìs têxtcharâcters ïn thìs têxtharâcters ïn thìs têxtarâcters ïn thìs têxtrâcters ïn thìs têxtâcters ïn thìs têxt�cters ïn thìs têxtcters ïn thìs têxtters ïn thìs têxters ïn thìs têxtrs ïn thìs têxts ïn thìs têxt ïn thìs têxtïn thìs têxt�n thìs têxtn thìs têxt thìs têxtthìs têxthìs têxtìs têxt�s têxts têxt têxttêxtêxt�xtxtt

Expected result:

Th&#233;re &#192;re s&#244;me sp&#235;cial char&#226;cters &#239;n th&#236;s t&#234;xt

Can anyone tell me what is wrong with my PHP function?

Thanks.

Update

function str_to_html_entity($str) {
    $result = null;
    for ($i = 0, $length = mb_strlen($str, 'UTF-8'); $i < $length; $i++) {
        $character = mb_substr($str, $i, 1, 'UTF-8');
        if (strlen($character) > 1) {  // the character consists of more than 1 byte
            $character = htmlentities($character, ENT_COMPAT, 'UTF-8');
        }
        $result .= $character;
    }

  return $result;
}

echo str_to_html_entity("Thére Àre"); // Th&eacute;re &Agrave;re
echo str_to_html_entity("中"); // 中

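2 answers:

Answer 0: (score: 2)

In general, PHP's string functions such as strlen and ord work on bytes, whereas JavaScript's charCodeAt works on UTF-16 code units.

So you can't replicate exactly the same algorithm in PHP. In addition, inside the loop you use the whole $str instead of the character at the current offset, which is your other main problem. To make it Unicode-aware, this is probably the best approach: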

$result = null;
foreach (preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) as $character) {  // split between UTF-8 characters
    if (strlen($character) > 1) {  // the character consists of more than 1 byte
        $character = mb_convert_encoding($character, 'HTML-ENTITIES', 'UTF-8');
    }
    $result .= $character;
}

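This requires the string to be UTF-8 encoded. As you can see, there is a nice function called mb_convert_encoding that can escape a whole block of text in one go, and you are essentially reinventing it. Use it instead.

Alternative version for PCRE builds without Unicode support: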

$result = null;
for ($i = 0, $length = mb_strlen($str, 'UTF-8'); $i < $length; $i++) {
    $character = mb_substr($str, $i, 1, 'UTF-8');
    if (strlen($character) > 1) {  // the character consists of more than 1 byte
        $character = mb_convert_encoding($character, 'HTML-ENTITIES', 'UTF-8');
    }
    $result .= $character;
}

But seriously, just use $str = mb_convert_encoding($str, 'HTML-ENTITIES', 'UTF-8') and be done with it. No loop needed.
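If decimal numeric entities like those in the question's expected output are needed specifically, a loop over code points is one option. The following is only a minimal sketch and not part of the original answer: it assumes valid UTF-8 input and PHP 7.2 or later with the mbstring extension (for mb_ord), and the function name str_to_numeric_entities is made up for illustration. It may also be worth knowing that the HTML-ENTITIES encoding of mb_convert_encoding has been deprecated since PHP 8.2.

```php
<?php
// Hypothetical helper (not from the original answer): encodes every
// non-ASCII character as a decimal numeric entity.
// Assumes UTF-8 input and PHP >= 7.2 with the mbstring extension.
function str_to_numeric_entities(string $str): string
{
    // An empty pattern with the /u flag splits between UTF-8 characters.
    $characters = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($characters === false) {
        // preg_split() fails under /u when the input is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $result = '';
    foreach ($characters as $character) {
        if (strlen($character) > 1) {
            // Multi-byte character: emit its Unicode code point.
            $character = '&#' . mb_ord($character, 'UTF-8') . ';';
        }
        $result .= $character;
    }

    return $result;
}

echo str_to_numeric_entities("Thére Àre sôme spëcial charâcters ïn thìs têxt");
```

Unlike the JavaScript original, which reads UTF-16 code units via charCodeAt, this sketch emits one entity per full code point, so the two can differ for characters outside the Basic Multilingual Plane.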

Answer 1: (score: 1)

Your function has several mistakes. Have a look at my fixes:

function str_to_html_entity($str) {
    $output = NULL;

    $length = strlen($str);  // computed once, outside the loop
    for($i = 0; $i < $length; $i++) {
        if(ord($str[$i]) > 127) {
            $output .= '&#' . ord($str[$i]) . ';';
        } else {
            $output.= $str[$i];
        }
  }

  return $output;
}
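The changes in this version can be checked in isolation. The snippet below is only an illustration with made-up sample values (it is not from the original answer, and the \u{...} escape assumes PHP 7.0 or later). It shows that ord() looks only at the first byte of its argument, that substr() without a length returns the rest of the string, and that "." rather than "+" concatenates. It also shows that ord() and strlen() are byte-oriented, so with UTF-8 input a character such as é is seen as two separate bytes.

```php
<?php
$str = "abc";

// ord() only inspects the first byte, so ord($str) is the same for every $i.
var_dump(ord($str) === ord("a"));      // bool(true)
var_dump(ord($str[1]) === ord("b"));   // bool(true)

// substr() without a length returns everything from the offset onward.
var_dump(substr($str, 1));             // string(2) "bc"
var_dump(substr($str, 1, 1));          // string(1) "b"

// "." concatenates strings; "+" is arithmetic in PHP.
var_dump('&#' . 233 . ';');            // string(6) "&#233;"

// Byte-oriented view of a UTF-8 string: "é" (U+00E9) occupies two bytes.
$e = "\u{00E9}";
var_dump(strlen($e));                  // int(2)
var_dump(ord($e[0]));                  // int(195)
var_dump(ord($e[1]));                  // int(169)
```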

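Edit 1

Also, compute the length once before the loop with

   $length = strlen($str);

as an optimization, so that strlen() is not called on every iteration.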