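I need to generate slugs for pages (for example, /my-page-slug), where the slugs will be generated from an ever-growing list of concepts. These concepts can contain special characters, diacritics, punctuation, and so on.
My goal is to come up with a robust slug-generation strategy that avoids future collisions while keeping URL readability and search engine optimization in mind.
I have looked at [RFC 3986] [1], as well as sites such as Wikipedia and Quora, to see how they handle specific cases.
So far, there does not seem to be a clear standard or best practice.
Is there a specific library that already handles this? Or do I have to implement my own custom solution?
Right now, I am considering a custom solution that does the following:
Is this approach headed in the right direction? Here is a proof of concept: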
function generateSlug($topic) {
    // URL encode
    $topic = rawurlencode($topic); // encodes according to RFC 3986: http://www.faqs.org/rfcs/rfc3986.html
    // Transform specific characters (all literal matches, so str_replace is sufficient)
    $topic = str_replace('%E2%80%93', '-', $topic); // decode en dash as hyphen
    $topic = str_replace('%E2%80%94', '--', $topic); // decode em dash as double hyphen
    $topic = str_replace('%E2%80%A6', '%20', $topic); // convert ellipsis to space
    $topic = str_replace('%26', '%20', $topic); // convert ampersand to space
    $topic = str_replace('%2F', '%20', $topic); // convert forward slash to space
    $topic = str_replace('%3F', '', $topic); // strip question marks
    $topic = str_replace('%28', '(', $topic); // decode opening parenthesis
    $topic = str_replace('%29', ')', $topic); // decode closing parenthesis
    $topic = str_replace('%21', '!', $topic); // decode exclamation mark
    $topic = str_replace('%27', '', $topic); // strip apostrophes
    $topic = str_replace('%22', '', $topic); // strip double quotation marks
    $topic = str_replace('%2A', '*', $topic); // decode asterisk
    $topic = str_replace('%2C', '', $topic); // strip commas
    $topic = str_replace('%3A', '', $topic); // strip colons
    $topic = str_replace('%3B', '', $topic); // strip semicolons
    // Remove leading and trailing spaces; they are encoded as "%20" by now, so trim() would not match them
    $topic = preg_replace('/^(?:%20)+|(?:%20)+$/', '', $topic);
    $topic = preg_replace('/(%20)+/', '-', $topic); // convert one or more spaces into a single hyphen
    return $topic;
}
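As a quick sanity check on the percent-encoded patterns used above, the following sketch prints what `rawurlencode()` produces for a few of the special-cased characters. It assumes UTF-8 input and PHP 7 or later (for the `\u{...}` escapes); the `$samples` array is only an illustrative name.

```php
<?php
// Characters that generateSlug() special-cases, paired with the
// percent-encoded form rawurlencode() is expected to produce (UTF-8 input).
$samples = [
    "\u{2013}" => '%E2%80%93', // en dash
    "\u{2014}" => '%E2%80%94', // em dash
    "\u{2026}" => '%E2%80%A6', // horizontal ellipsis
    '&'        => '%26',
    '/'        => '%2F',
    '?'        => '%3F',
    ' '        => '%20',
];

foreach ($samples as $char => $expected) {
    $encoded = rawurlencode($char);
    // Print the character, its encoded form, and whether it matches the pattern
    printf("%s => %s (%s)\n", $char, $encoded, $encoded === $expected ? 'ok' : 'mismatch');
}
```

Because every replacement runs on the encoded string, a space is always `%20` at that point, which is why a plain `trim()` after encoding has no effect on leading or trailing spaces.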
Answer 0 (score: 0)
Something like this should do the job:
// Written as a static method, so it needs to be placed inside a class
public static function formatUrlPermalink($var)
{
    $permasearch = explode(',', "À,Á,Â,Ã,Å,à,á,â,ã,å,Ò,Ó,Ô,Õ,Ø,ò,ó,ô,õ,ø,È,É,Ê,Ë,è,é,ê,ë,Ç,ç,Ì,Í,Î,Ï,ì,í,î,ï,Ù,Ú,Û,ù,ú,û,ÿ,Ñ,ñ,ß,ä,Ä,ö,Ö,ü,Ü");
    $permareplace = explode(',', "A,A,A,A,A,a,a,a,a,a,O,O,O,O,O,o,o,o,o,o,E,E,E,E,e,e,e,e,C,c,I,I,I,I,i,i,i,i,U,U,U,u,u,u,y,N,n,ss,ae,Ae,oe,Oe,ue,Ue");
    // Replace each character in $permasearch with the entry at the same position in $permareplace
    $var = str_replace($permasearch, $permareplace, $var);
    $permalinksseparator = '-';
    $quotedseparator = preg_quote($permalinksseparator, '/');
    // Lowercase, then turn slashes, plus signs and runs of whitespace into the separator
    $var = preg_replace('#(\s*/\s*|\s*\+\s*|\s+)#', $permalinksseparator, strtolower($var));
    // Drop everything except a-z, 0-9, underscore and the separator
    $var = preg_replace('/[^a-z0-9_' . $quotedseparator . ']/', '', $var);
    $var = preg_replace('/' . $quotedseparator . '+/', $permalinksseparator, $var); // collapse repeated separators
    $var = trim($var, $permalinksseparator); // no separator at the start or end
    return $var;
}
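The first two lines (the $permasearch and $permareplace lists) show how to adjust the special-character mapping as needed. The rest just removes whitespace and replaces it with '-' (the value of $permalinksseparator).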
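The function above does not deal with the collisions mentioned in the question: because characters are stripped, different titles can end up with the same slug. A minimal sketch of one possible way to handle that is to append a numeric suffix. Here `uniqueSlug()` and its parameters are hypothetical names, and the list of slugs already in use is assumed to be available as an array (in practice it would more likely be a database lookup).

```php
<?php
// Hypothetical helper, not part of the answer above: appends a numeric
// suffix until the slug is not in the list of slugs already in use.
function uniqueSlug(string $slug, array $taken): string
{
    // The slug is free, so use it unchanged
    if (!in_array($slug, $taken, true)) {
        return $slug;
    }
    // Otherwise try slug-2, slug-3, ... until a free one is found
    $n = 2;
    while (in_array($slug . '-' . $n, $taken, true)) {
        $n++;
    }
    return $slug . '-' . $n;
}

$taken = ['cafe', 'cafe-2'];
echo uniqueSlug('cafe', $taken), "\n"; // cafe-3
echo uniqueSlug('tea', $taken), "\n";  // tea
```

A suffix keeps existing URLs stable, since slugs that were already assigned never change when a new colliding title is added.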