How do I create an SEO-friendly dash-delimited URL from a string?

Time: 2009-01-21 15:03:41

Tags: language-agnostic string seo slug

Take a string such as:

  

In C#: How do I add "Quotes" around string in a comma delimited list of strings?

and convert it to:

  

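in-c-how-do-i-add-quotes-around-string-in-a-comma-delimited-list-of-strings

Requirements:

  • Separate each word with a dash and remove all punctuation (taking into account that not all words are separated by spaces).
  • The function takes a maximum length and returns all the tokens that fit within that maximum length. Example: ToSeoFriendly("hello world hello world", 14) returns "hello-world"
  • All words are converted to lowercase.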
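
The three requirements above can be sketched in Python as follows. This is only an illustration under stated assumptions: the name to_seo_friendly is hypothetical, a "word" is taken to be a run of ASCII letters or digits, and a single word longer than the limit is simply cut to the limit.

```python
import re


def to_seo_friendly(title: str, max_length: int) -> str:
    """Join lowercase alphanumeric tokens with dashes, keeping only whole
    tokens that fit within max_length (hypothetical helper)."""
    # Lowercase first, then treat every run of letters/digits as one word.
    # Punctuation and whitespace both act as separators.
    tokens = re.findall(r"[a-z0-9]+", title.lower())

    slug = ""
    for token in tokens:
        candidate = token if not slug else slug + "-" + token
        if len(candidate) > max_length:
            break  # the next whole token no longer fits
        slug = candidate

    # Assumption: if even the first token is too long, cut it to the limit.
    if not slug and tokens:
        slug = tokens[0][:max_length]
    return slug


print(to_seo_friendly("hello world hello world", 14))  # hello-world
```

Because the slug is built token by token, it never ends in a dash and never cuts a word in half, except in the single-long-word fallback.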

On a separate note, should there be a minimum length?

13 answers:

Answer 0: (score: 10)

Here is my solution in C#:

private string ToSeoFriendly(string title, int maxLength) {
    var match = Regex.Match(title.ToLower(), "[\\w]+");
    StringBuilder result = new StringBuilder("");
    bool maxLengthHit = false;
    while (match.Success && !maxLengthHit) {
        if (result.Length + match.Value.Length <= maxLength) {
            result.Append(match.Value + "-");
        } else {
            maxLengthHit = true;
            // Handle a situation where there is only one word and it is greater than the max length.
            if (result.Length == 0) result.Append(match.Value.Substring(0, maxLength));
        }
        match = match.NextMatch();
    }
    // Remove the trailing '-' (the length check avoids an exception when nothing matched)
    if (result.Length > 0 && result[result.Length - 1] == '-') result.Remove(result.Length - 1, 1);
    return result.ToString();
}

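Answer 1: (score: 7)

I would follow these steps:

  1. Convert the string to lowercase
  2. Replace unwanted characters with hyphens
  3. Replace multiple hyphens with a single hyphen (not necessary, as the preg_replace() call already prevents multiple hyphens)
  4. Remove hyphens at the beginning and end if necessary
  5. If needed, trim from the last hyphen before position x to the end

So, all together in one function (PHP):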

    function generateUrlSlug($string, $maxlen=0)
    {
        $string = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($string)), '-');
        if ($maxlen && strlen($string) > $maxlen) {
            $string = substr($string, 0, $maxlen);
            $pos = strrpos($string, '-');
            if ($pos > 0) {
                $string = substr($string, 0, $pos);
            }
        }
        return $string;
    }
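
For readers who want to try these steps outside PHP, here is a rough Python port of the function above. It is a sketch, not a drop-in replacement: the name generate_url_slug is hypothetical, and it assumes the same ASCII-only notion of an allowed character.

```python
import re


def generate_url_slug(text: str, max_len: int = 0) -> str:
    """Lowercase, replace runs of unwanted characters with one hyphen,
    strip edge hyphens, then cut back to the last hyphen within max_len."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    if max_len and len(slug) > max_len:
        slug = slug[:max_len]
        pos = slug.rfind("-")
        if pos > 0:
            # Drop the (possibly partial) last word.
            slug = slug[:pos]
    return slug


print(generate_url_slug("hello world hello world", 14))  # hello-world
```

One consequence of cutting at the last hyphen: when the limit lands exactly on a word boundary, the final complete word is dropped as well. For example, a limit of 11 on the input above gives "hello" rather than "hello-world".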
    

Answer 2: (score: 4)

C#

public string toFriendly(string subject)
{
    subject = subject.Trim().ToLower();
    subject = Regex.Replace(subject, @"\s+", "-");
    subject = Regex.Replace(subject, @"[^A-Za-z0-9_-]", "");
    return subject;
}

Answer 3: (score: 2)

Here is a PHP solution:

function make_uri($input, $max_length) {
  if (function_exists('iconv')) {  
    $input = @iconv('UTF-8', 'ASCII//TRANSLIT', $input);  
  }

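  $lower = strtolower($input);

  // Drop everything except lowercase letters, digits and spaces, then split on runs of spaces.
  $without_special = preg_replace('/[^a-z0-9 ]/', '', $lower);
  $tokens = preg_split('/ +/', $without_special, -1, PREG_SPLIT_NO_EMPTY);

  $result = '';

  foreach ($tokens as $token) {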
    if (strlen($result.'-'.$token) > $max_length+1) {
      break;
    }

    $result .= '-'.$token;       
  }

  return substr($result, 1);
}

Usage:

echo make_uri('In C#: How do I add "Quotes" around string in a ...', 500);

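Unless you need the URIs to be typable, they don't need to be short. But you should specify a maximum so that the URLs work well with proxies and the like.

Answer 4: (score: 2)

A better version: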

function Slugify($string)
{
    return strtolower(trim(preg_replace(array('~[^0-9a-z]~i', '~-+~'), '-', $string), '-'));
}

Answer 5: (score: 1)

A solution in Perl:

my $input = 'In C#: How do I add "Quotes" around string in a comma delimited list of strings?';

my $length = 20;
$input =~ s/[^a-z0-9]+/-/gi;
$input =~ s/^(.{1,$length}).*/\L$1/;

print "$input\n";

Done.

Answer 6: (score: 1)

A solution in shell:

echo 'In C#: How do I add "Quotes" around string in a comma delimited list of strings?' | \
    tr A-Z a-z | \
    sed 's/[^a-z0-9]\+/-/g;s/^\(.\{1,20\}\).*/\1/'

Answer 7: (score: 1)

This is close to how Stack Overflow generates slugs:

public static string GenerateSlug(string title)
{
    string slug = title.ToLower();
    if (slug.Length > 81)
      slug = slug.Substring(0, 81);
    slug = Regex.Replace(slug, @"[^a-z0-9\-_\./\\ ]+", "");
    slug = Regex.Replace(slug, @"[^a-z0-9]+", "-");

    if (slug.Length > 0 && slug[slug.Length - 1] == '-')
      slug = slug.Remove(slug.Length - 1, 1);
    return slug;
}

Answer 8: (score: 0)

A way to do this, at least in PHP, is:

function CleanForUrl($urlPart, $maxLength = null) {
    $url = strtolower(preg_replace(array('/[^a-z0-9\- ]/i', '/[ \-]+/'), array('', '-'), trim($urlPart)));
    if ($maxLength) $url = substr($url, 0, $maxLength);
    return $url;
}

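You might as well do the trim() at the start, so there is less to process later, and do all of the replacement inside preg_replace().

Most of this comes from: What is the best way to clean a string for placement in a URL, like the question name on SO?

Answer 9: (score: 0)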

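In dynamic URLs, these IDs are passed via the query string to the script, with the dash as the delimiter character, because most search engines treat the dash as ...... NET: A Developer's Guide to SEO also covers these three other methods search engine optimization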

Answer 10: (score: 0)

Another season, another reason, for choosing Ruby :)

def seo_friendly(str)
  str.strip.downcase.gsub /\W+/, '-'
end

That's it.

Answer 11: (score: 0)

In Python (if Django is installed, even if you are using another framework):

from django.template.defaultfilters import slugify
slugify('In C#: How do I add "Quotes" around string in a comma delimited list of strings?')

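Answer 12: (score: 0)

To do this we need to:

  1. Normalize the text
  2. Remove all diacritics
  3. Replace international characters
  4. Be able to shorten the text to match SEO thresholds

I wanted a function that generates the whole string and also accepts an optional maximum length as input; this is the result.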

public static class StringHelper
{
/// <summary>
/// Creates a URL And SEO friendly slug
/// </summary>
/// <param name="text">Text to slugify</param>
/// <param name="maxLength">Max length of slug</param>
/// <returns>URL and SEO friendly string</returns>
public static string UrlFriendly(string text, int maxLength = 0)
{
    // Return empty value if text is null
    if (text == null) return "";

    var normalizedString = text
        // Make lowercase
        .ToLowerInvariant()
        // Normalize the text
        .Normalize(NormalizationForm.FormD);

    var stringBuilder = new StringBuilder();
    var stringLength = normalizedString.Length;
    var prevdash = false;
    var trueLength = 0;

    char c;

    for (int i = 0; i < stringLength; i++)
    {
        c = normalizedString[i];

        switch (CharUnicodeInfo.GetUnicodeCategory(c))
        {
            // The character is a letter or a digit. If it is an
            // international character, remap it to a valid ASCII character
            case UnicodeCategory.LowercaseLetter:
            case UnicodeCategory.UppercaseLetter:
            case UnicodeCategory.DecimalDigitNumber:
                if (c < 128)
                    stringBuilder.Append(c);
                else
                    stringBuilder.Append(ConstHelper.RemapInternationalCharToAscii(c));

                prevdash = false;
                trueLength = stringBuilder.Length;
                break;

            // Check if the character is to be replaced by a hyphen but only if the last character wasn't
            case UnicodeCategory.SpaceSeparator:
            case UnicodeCategory.ConnectorPunctuation:
            case UnicodeCategory.DashPunctuation:
            case UnicodeCategory.OtherPunctuation:
            case UnicodeCategory.MathSymbol:
                if (!prevdash)
                {
                    stringBuilder.Append('-');
                    prevdash = true;
                    trueLength = stringBuilder.Length;
                }
                break;
        }

        // If we are at max length, stop parsing
        if (maxLength > 0 && trueLength >= maxLength)
            break;
    }

    // Trim excess hyphens
    var result = stringBuilder.ToString().Trim('-');

    // Remove any excess character to meet maxlength criteria
    return maxLength <= 0 || result.Length <= maxLength ? result : result.Substring(0, maxLength);
}
}

This helper is used to remap some international characters to readable ones.

public static class ConstHelper
{
/// <summary>
/// Remaps international characters to ascii compatible ones
/// based on: https://meta.stackexchange.com/questions/7435/non-us-ascii-characters-dropped-from-full-profile-url/7696#7696
/// </summary>
/// <param name="c">Character to remap</param>
/// <returns>Remapped character</returns>
public static string RemapInternationalCharToAscii(char c)
{
    string s = c.ToString().ToLowerInvariant();
    if ("àåáâäãåą".Contains(s))
    {
        return "a";
    }
    else if ("èéêëę".Contains(s))
    {
        return "e";
    }
    else if ("ìíîïı".Contains(s))
    {
        return "i";
    }
    else if ("òóôõöøőð".Contains(s))
    {
        return "o";
    }
    else if ("ùúûüŭů".Contains(s))
    {
        return "u";
    }
    else if ("çćčĉ".Contains(s))
    {
        return "c";
    }
    else if ("żźž".Contains(s))
    {
        return "z";
    }
    else if ("śşšŝ".Contains(s))
    {
        return "s";
    }
    else if ("ñń".Contains(s))
    {
        return "n";
    }
    else if ("ýÿ".Contains(s))
    {
        return "y";
    }
    else if ("ğĝ".Contains(s))
    {
        return "g";
    }
    else if (c == 'ř')
    {
        return "r";
    }
    else if (c == 'ł')
    {
        return "l";
    }
    else if (c == 'đ')
    {
        return "d";
    }
    else if (c == 'ß')
    {
        return "ss";
    }
    else if (c == 'þ')
    {
        return "th";
    }
    else if (c == 'ĥ')
    {
        return "h";
    }
    else if (c == 'ĵ')
    {
        return "j";
    }
    else
    {
        return "";
    }
}
}

The function is used like this:

const string text = "ICH MUß EINIGE CRÈME BRÛLÉE HABEN";
Console.WriteLine(StringHelper.UrlFriendly(text));
// Output: 
// ich-muss-einige-creme-brulee-haben
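
The four steps listed at the top of this answer can also be sketched in Python using only the standard library. This is an illustration, not a port of the C# above. The name url_friendly and the small EXTRA_MAP table are hypothetical, and the table covers only a handful of letters that Unicode decomposition leaves untouched. Unlike the C# version, any remaining unmapped character becomes a hyphen instead of being dropped.

```python
import re
import unicodedata

# Hypothetical, deliberately small table for letters that NFD does not
# split into "base letter + accent"; extend it as needed.
EXTRA_MAP = {"ß": "ss", "ø": "o", "ł": "l", "đ": "d", "þ": "th", "ð": "o", "ı": "i"}


def url_friendly(text, max_length=0):
    """Normalize, strip diacritics, remap a few letters, slugify, shorten."""
    if text is None:
        return ""

    # 1. Normalize: lowercase, then decompose accented letters (NFD).
    decomposed = unicodedata.normalize("NFD", text.lower())

    chars = []
    for ch in decomposed:
        # 2. Remove diacritics: combining marks have a non-zero class.
        if unicodedata.combining(ch):
            continue
        # 3. Replace international characters that have no decomposition.
        chars.append(EXTRA_MAP.get(ch, ch))

    # Anything that is still not a-z or 0-9 collapses into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", "".join(chars)).strip("-")

    # 4. Shorten to the requested length, if any.
    if max_length > 0:
        slug = slug[:max_length].strip("-")
    return slug


print(url_friendly("ICH MUß EINIGE CRÈME BRÛLÉE HABEN"))
# ich-muss-einige-creme-brulee-haben
```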

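This question has already been answered many times here, but none of the answers were optimized. You can find the full source code with some examples here on github. You can read more on Johan Boström's Blog. In addition, this function is compatible with .NET 4.5+ and .NET Core.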