我很好奇最有效的字符串转换方法是什么。给定n个输入字符串和一组翻译,一般来说哪种方法效率最高?我目前使用strtr()
,但已经测试了各种循环方法,str_replace()
带有数组等。strtr()
方法在我的系统上基准最快,取决于翻译,但我是好奇,如果有更快的方法,我还没有想到。
如果它是相关的,我的特定用例涉及将2字节字符串转换为终端的ANSI颜色序列。例如:
// In practice, the number of translations is much greater than one...
$out = strtr("|rThis should be red", array('|r' => "\033[31m"));
答案 0 :(得分:3)
对于简单替换,strtr
似乎更快,但是当您使用许多搜索字符串进行复杂替换时,str_replace
似乎有优势。
答案 1 :(得分:1)
我为这两个功能的个人需求制定了一个微不足道的基准。目标是改变小写字母' e'大写' E'。
<?php
$stime = time();
for ($i = 0; $i < 1000000; $i++) {
str_replace('e', 'E', "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed non risus. Suspendisse lectus tortor, dignissim sit amet, adipiscing nec, ultricies sed, dolor. Cras elementum ultrices diam. Maecenas ligula massa, varius a, semper congue, euismod non, mi. Proin porttitor, orci nec nonummy molestie, enim est eleifend mi, non fermentum diam nisl sit amet erat. Duis semper. Duis arcu massa, scelerisque vitae, consequat in, pretium a, enim. Pellentesque congue. Ut in risus volutpat libero pharetra tempor. Cras vestibulum bibendum augue. Praesent egestas leo in pede. Praesent blandit odio eu enim. Pellentesque sed dui ut augue blandit sodales. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Aliquam nibh. Mauris ac mauris sed pede pellentesque fermentum. Maecenas adipiscing ante non diam sodales hendrerit.");
}
echo time() - $stime . "\n";
?>
使用str_replace
的代码在6秒内运行。现在与strtr
函数相同:
<?php
$stime = time();
for ($i = 0; $i < 1000000; $i++) {
strtr("Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed non risus. Suspendisse lectus tortor, dignissim sit amet, adipiscing nec, ultricies sed, dolor. Cras elementum ultrices diam. Maecenas ligula massa, varius a, semper congue, euismod non, mi. Proin porttitor, orci nec nonummy molestie, enim est eleifend mi, non fermentum diam nisl sit amet erat. Duis semper. Duis arcu massa, scelerisque vitae, consequat in, pretium a, enim. Pellentesque congue. Ut in risus volutpat libero pharetra tempor. Cras vestibulum bibendum augue. Praesent egestas leo in pede. Praesent blandit odio eu enim. Pellentesque sed dui ut augue blandit sodales. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Aliquam nibh. Mauris ac mauris sed pede pellentesque fermentum. Maecenas adipiscing ante non diam sodales hendrerit.", 'e', 'E');
}
echo time() - $stime . "\n";
?>
只用了4秒钟。
正如T0xicCode所述,对于这个特别简单的情况,strtr
确实比str_replace
更快,但差异并不那么显着。
答案 2 :(得分:1)
strtr()在直接替换字符时性能最佳。较长的字符串为str_replace()提供了边缘。
例如,下面的代码在我的(共享Web托管)系统上产生以下结果:
Execution timings on PHP 7.0.6:
test_strtr(): 0.37670969963074; result: Lorem ipsum dolor sit amet\, \tconsectetur adipiscing elit\, \nsed do eiusmod \%tempor \'incididunt\' ut labore et DELIMITER dolore\; trunc8 \\magna \"aliqua\".
test_str_ireplace(): 0.73557734489441; result: Lorem ipsum dolor sit amet, \tconsectetur adipiscing elit, \\nsed do eiusmod \%tempor \'incididunt\' ut labore et de-limiter dolore\; trunc8 \\magna \"aliqua\".
test_str_replace(): 0.28119778633118; result: Lorem ipsum dolor sit amet, \tconsectetur adipiscing elit, \\nsed do eiusmod \%tempor \'incididunt\' ut labore et DELIMITER dolore\; trunc8 \\magna \"aliqua\".
当我们取出'分隔符'和'截断'时,结果变为:
Execution timings on PHP 7.0.6:
test_strtr(): 0.14877104759216; result: Lorem ipsum dolor sit amet\, \tconsectetur adipiscing elit\, \nsed do eiusmod \%tempor \'incididunt\' ut labore et DELIMITER dolore\; truncate \\magna \"aliqua\".
test_str_ireplace(): 0.58186745643616; result: Lorem ipsum dolor sit amet, \tconsectetur adipiscing elit, \\nsed do eiusmod \%tempor \'incididunt\' ut labore et DELIMITER dolore\; truncate \\magna \"aliqua\".
test_str_replace(): 0.20531725883484; result: Lorem ipsum dolor sit amet, \tconsectetur adipiscing elit, \\nsed do eiusmod \%tempor \'incididunt\' ut labore et DELIMITER dolore\; truncate \\magna \"aliqua\".
因此,从PHP 7.0.6开始,strtr()会因更长的替换而受到相当大的惩罚。代码:
const LOOP = 333;
const SQL_ESCAPE_MAP = array( // see https://www.owasp.org/index.php/SQL_Injection_Prevention_Cheat_Sheet#MySQL_Escaping
"\x00" => '\x00', // NUL
"\n" => '\n', // LF
"\r" => '\r', // CR
"\\" => '\\\\', // backslash
"'" => "\'", // single quote
'"' => '\"', // double quote
"\x1a" => '\x1a', // SUB or \Z (substitute for an invalid character)
"\t" => '\t', // TAB
"\x08" => '\b', // BS
'%' => '\%', // Percent
'_' => '\_', // Underscore
';' => '\;', // Semicolon
',' => '\,', // Comma
'delimiter' => 'de-limiter', // SQL delimiter keyword
'truncate' => 'trunc8', // SQL truncate keyword
);
const SQL_SEARCH = array("\x00", "\n", "\r", "\\", "'", '"', "\x1a", "\t", "\x08", "%", ";", 'delimiter', 'truncate');
const SQL_REPLACE = array('\x00','\n','\r','\\\\',"\'",'\"', '\x1a', '\t', '\b', '\%', '\;', 'de-limiter', 'trunc8');
const TEST_STRING = "Lorem ipsum dolor sit amet, \tconsectetur adipiscing elit, \nsed do eiusmod %tempor 'incididunt' ut labore et DELIMITER dolore; truncate \magna \"aliqua\".";
function test_strtr() {
for($i= 0; $i < LOOP; $i++) {
$new_string = strtr(TEST_STRING, SQL_ESCAPE_MAP);
}
return $new_string;
}
function test_str_ireplace() {
for($i= 0; $i < LOOP; $i++) {
$new_string = str_ireplace(SQL_SEARCH, SQL_REPLACE, TEST_STRING);
}
return $new_string;
}
function test_str_replace() {
for($i= 0; $i < LOOP; $i++) {
$new_string = str_replace(SQL_SEARCH, SQL_REPLACE, TEST_STRING);
}
return $new_string;
}
$timings = array(
'test_strtr' => 0,
'test_str_ireplace' => 0,
'test_str_replace' => 0,
);
for($i= 0; $i < LOOP; $i++) {
foreach(array_keys($timings) as $func) {
$start = microtime(true);
$$func = $func();
$timings[$func] += microtime(true) - $start;
}
}
echo '<pre>Execution timings on PHP ' . phpversion('tidy') . ":\n";
foreach(array_keys($timings) as $func) {
echo $func . '(): ' . $timings[$func] . '; result: ' . $$func . "\n";
}
echo "</pre>\n";
注意:
此示例代码并不是替代数据库连接的mysqli :: real_escape_string的替代产品(二进制/多字节编码输入存在问题)。
显然,差异很小。出于助记的原因(如何组织匹配和替换),我更喜欢strtr本地采用的关联数组。 (并非strivplace的array_keys()无法实现。)这种情况下的差异肯定在微优化的范围内,并且对于不同的输入可能非常不同。如果您需要每秒处理大量字符串数千次,请使用您的特定数据进行基准测试。