Question

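I'm trying to figure out a way to use ToTitleCase so that it ignores ordinals. It works the way I want for all strings except ordinals (e.g. 1st, 2nd, 3rd become 1St, 2Nd, 3Rd).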

Any help would be appreciated. A regular expression may be the way to handle this; I'm just not sure how such a regex would be constructed.

Update: here is the solution I used (John's answer below, wrapped in an extension method):

public static string ToTitleCaseIgnoreOrdinals(this string text)
{
    string input = System.Globalization.CultureInfo.CurrentCulture.TextInfo.ToTitleCase(text);
    string result = System.Text.RegularExpressions.Regex.Replace(input, "([0-9]st)|([0-9]th)|([0-9]rd)|([0-9]nd)", new System.Text.RegularExpressions.MatchEvaluator((m) => m.Captures[0].Value.ToLower()), System.Text.RegularExpressions.RegexOptions.IgnoreCase);
    return result;
}

Answer 1

string input =  System.Globalization.CultureInfo.CurrentCulture.TextInfo.ToTitleCase("hello there, this is the 1st");
string result = System.Text.RegularExpressions.Regex.Replace(input, "([0-9]st)|([0-9]th)|([0-9]rd)|([0-9]nd)", new System.Text.RegularExpressions.MatchEvaluator((m) =>
{
    return m.Captures[0].Value.ToLower();
}), System.Text.RegularExpressions.RegexOptions.IgnoreCase);
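The pattern above looks at a single digit plus suffix anywhere in the string, so it could also lowercase the start of a longer word that happens to begin with a digit followed by "St" or "Th". A minimal sketch of a stricter variant follows, assuming whole-word matching is what is wanted; the class and method names (`OrdinalTitleCase`, `ToTitleCaseKeepOrdinals`) are made up for illustration, and `InvariantCulture` is used only to keep the example deterministic.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class OrdinalTitleCase
{
    // One or more digits followed by an ordinal suffix, matched as a whole word.
    // IgnoreCase lets the pattern match the "1St" form produced by ToTitleCase.
    private static readonly Regex OrdinalPattern =
        new Regex(@"\b([0-9]+)(st|nd|rd|th)\b", RegexOptions.IgnoreCase);

    public static string ToTitleCaseKeepOrdinals(string text)
    {
        // Title-case everything first, then lowercase only the ordinal suffixes.
        string titled = CultureInfo.InvariantCulture.TextInfo.ToTitleCase(text);
        return OrdinalPattern.Replace(
            titled,
            m => m.Groups[1].Value + m.Groups[2].Value.ToLowerInvariant());
    }
}
```

Called as `OrdinalTitleCase.ToTitleCaseKeepOrdinals("150 east 40th street")`, this should give "150 East 40th Street". Note that it does not check whether the suffix is the right one for the number, so "31th" is lowercased as well.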

Answer 2

Before converting to title case, you can use a regular expression to check whether the string starts with a digit, like this:

if (!Regex.IsMatch(text, @"^\d+"))
{
    // ToTitleCase returns a new string, so the result has to be assigned.
    text = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(text);
}

Edit: forgot to reverse the conditional. Changed so that ToTitleCase is applied only if the string does NOT match.

Second edit: added a loop to check all the words in a sentence:

string text = "150 east 40th street";

string[] array = text.Split(' ');

for (int i = 0; i < array.Length; i++)
{
    if (!Regex.IsMatch(array[i], @"^\d+"))
    {
        array[i] = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(array[i]);
    }
}

string newText = string.Join(" ", array);

Answer 3

This would work for those strings. You could wrap ToTitleCase() in an extension method that performs this check.

string s = "1st";

if (s[0] >= '0' && s[0] <= '9')
{
    // This string starts with a digit, so don't call ToTitleCase().
}
else
{
    // Call ToTitleCase().
}
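An extension method cannot replace TextInfo.ToTitleCase itself, but it can wrap the check. A minimal sketch, assuming a single word is passed in; the names `StringTitleCaseExtensions` and `ToTitleCaseUnlessNumeric` are invented for this example, and `InvariantCulture` is an arbitrary choice.

```csharp
using System.Globalization;

// Hypothetical extension class, not part of the original answer.
public static class StringTitleCaseExtensions
{
    public static string ToTitleCaseUnlessNumeric(this string s)
    {
        // Null, empty, or digit-leading strings (such as "1st") are returned unchanged.
        if (string.IsNullOrEmpty(s) || (s[0] >= '0' && s[0] <= '9'))
        {
            return s;
        }

        return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(s);
    }
}
```

With this in scope, `"1st".ToTitleCaseUnlessNumeric()` leaves the string alone, while `"street".ToTitleCaseUnlessNumeric()` is title-cased as usual.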

Answer 4

You can simply use String.Replace (or StringBuilder.Replace):

string[] ordinals = { "1St", "2Nd", "3Rd" };  // add all others
string text = "This is just sample text which contains some ordinals, the 1st, the 2nd and the third.";
var sb = new StringBuilder(CultureInfo.InvariantCulture.TextInfo.ToTitleCase(text));
foreach (string ordinal in ordinals)
    sb.Replace(ordinal, ordinal.ToLowerInvariant());
text = sb.ToString();

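This isn't elegant at all. It requires you to maintain an infinite list of ordinals on the first line. I assume that's why someone downvoted you.

It's not elegant, but it works better than other simple approaches such as regex. You want to title-case the words in a longer text, but only the words that are not ordinal numbers. Ordinal numbers are, for example, 1st, 2nd or 3rd, and 31st but not 31th. So the simple regex solutions will fail quickly. You also want to title-case words like 10m to 10M (where M could be the abbreviation for million).

So I don't understand why maintaining a list of ordinal numbers is so bad.

You could even generate them automatically with an upper limit, for example: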

public static IEnumerable<string> GetTitleCaseOrdinalNumbers()
{
    // Stop before int.MaxValue so that num++ cannot overflow.
    for (int num = 1; num < int.MaxValue; num++)
    {
        // Numbers ending in 11, 12 or 13 always take "th".
        switch (num % 100)
        {
            case 11:
            case 12:
            case 13:
                yield return num + "Th";
                continue; // skip the last-digit switch so "11St" is not emitted as well
        }

        switch (num % 10)
        {
            case 1:
                yield return num + "St"; break;
            case 2:
                yield return num + "Nd"; break;
            case 3:
                yield return num + "Rd"; break;
            default:
                yield return num + "Th"; break;
        }
    }
}

So if you want to check the first 1000 ordinal numbers:

foreach (string ordinal in GetTitleCaseOrdinalNumbers().Take(1000)) 
   sb.Replace(ordinal, ordinal.ToLowerInvariant());

Update

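For what it's worth, here is my attempt at an efficient approach that really checks words (not just substrings) and skips ToTitleCase for words that really represent ordinal numbers (so not 31th, but 31st for example). It also handles separator characters that are not whitespace (like dots or commas):

[The code for this update is missing from this copy.]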

Note that this has not been tested yet, but it should give you an idea.
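The idea described in this update (check whole words, accept only well-formed ordinals, leave separators alone) could be sketched roughly as follows. This is an independent illustration, not the answerer's code: the names `WordLevelTitleCase`, `IsOrdinal` and `ToTitleCaseSkippingOrdinals` are hypothetical, a "word" is assumed to be a run of letters, digits and apostrophes, and `InvariantCulture` is an arbitrary choice.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating a word-level check.
public static class WordLevelTitleCase
{
    private static readonly Regex OrdinalShape =
        new Regex(@"^([0-9]+)(st|nd|rd|th)$", RegexOptions.IgnoreCase);

    // Assumption: a word is a run of letters, digits and apostrophes;
    // anything else (spaces, dots, commas) is a separator and is left untouched.
    private static readonly Regex Word = new Regex(@"[\p{L}\p{Nd}']+");

    // True only for well-formed English ordinals: 1st, 2nd, 11th, 31st; not 31th or 2st.
    public static bool IsOrdinal(string word)
    {
        Match m = OrdinalShape.Match(word);
        if (!m.Success) return false;

        // Only the last two digits decide the suffix, so long numbers cannot overflow.
        string digits = m.Groups[1].Value;
        string tail = digits.Length > 2 ? digits.Substring(digits.Length - 2) : digits;
        int lastTwo = int.Parse(tail, CultureInfo.InvariantCulture);

        string expected;
        if (lastTwo >= 11 && lastTwo <= 13) expected = "th";
        else if (lastTwo % 10 == 1) expected = "st";
        else if (lastTwo % 10 == 2) expected = "nd";
        else if (lastTwo % 10 == 3) expected = "rd";
        else expected = "th";

        return string.Equals(m.Groups[2].Value, expected, StringComparison.OrdinalIgnoreCase);
    }

    public static string ToTitleCaseSkippingOrdinals(string text)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;

        // Real ordinals are lowercased; every other word is title-cased.
        return Word.Replace(text, m => IsOrdinal(m.Value)
            ? m.Value.ToLowerInvariant()
            : textInfo.ToTitleCase(m.Value));
    }
}
```

For example, `WordLevelTitleCase.ToTitleCaseSkippingOrdinals("the 1st, the 2nd and 10m.")` should yield "The 1st, The 2nd And 10M.", while a malformed ordinal such as "31th" is treated as an ordinary word and title-cased.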

Answer 5

I would split the text up and iterate over the resulting array, skipping anything that doesn't start with a letter.

        using System.Globalization;

        TextInfo textInfo = new CultureInfo("en-US", false).TextInfo;

        string[] text = myString.Split();
        for(int i = 0; i < text.Length; i++)
        {   //Check for zero-length strings, because these will throw an
            //index out of range exception in Char.IsLetter
            if (text[i].Length > 0 && Char.IsLetter(text[i][0]))
            {
                text[i] = textInfo.ToTitleCase(text[i]);
            }

        }

ToTitleCase ignoring ordinals in C#