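Is there a function that returns the root letter of a special character?

Time: 2009-02-09 19:20:25

Tags: .net .net-3.5

In .NET, is there a function that returns the root letter (the letter without special attributes such as a cedilla), something like: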

Select Case c
  Case "á", "à", "ã", "â", "ä", "ª" : x = "a"
  Case "é", "è", "ê", "ë" : x = "e"
  Case "í", "ì", "î", "ï" : x = "i"
  Case "ó", "ò", "õ", "ô", "ö", "º" : x = "o"
  Case "ú", "ù", "û", "ü" : x = "u"

  Case "Á", "À", "Ã", "Â", "Ä" : x = "A"
  Case "É", "È", "Ê", "Ë" : x = "E"
  Case "Í", "Ì", "Î", "Ï" : x = "I"
  Case "Ó", "Ò", "Õ", "Ô", "Ö" : x = "O"
  Case "Ú", "Ù", "Û", "Ü" : x = "U"

  Case "ç" : x = "c"
  Case "Ç" : x = "C"

  Case Else
       x = c
End Select

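This code leaves out some letters, but it is only meant as an example :)

4 answers:

Answer 0: (score: 9)

Answer 1: (score: 2)

As an aside (completely unrelated to the question), your code operates on strings. This is not only less efficient, it also does not really make sense, because you are interested in single characters rather than strings, and these are distinct data types in .NET.

To get a character literal instead of a string literal, append c to the literal: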

Select Case c
  Case "á"c, "à"c, "ã"c, "â"c, "ä"c, "ª"c : x = "a"c
  ' … and so on. '
End Select

Answer 2: (score: 1)

Taken from Chetan Sastry's reply, here is the VB.NET code, together with the C# copied from his GREAT answer:

VB:

Imports System.Text
Imports System.Globalization

''' <summary>
''' Removes the special attributes of the letters passed in the word
''' </summary>
''' <param name="word">Word to be normalized</param>
Function RemoveDiacritics(ByVal word As String) As String
    ' FormD splits each accented letter into a base letter plus combining marks.
    Dim normalizedString As String = word.Normalize(NormalizationForm.FormD)
    Dim r As New StringBuilder()

    ' Visit every character; the upper bound of a VB For loop is inclusive.
    For i As Integer = 0 To normalizedString.Length - 1
        Dim c As Char = normalizedString(i)
        ' Keep everything except the combining (non-spacing) marks.
        If CharUnicodeInfo.GetUnicodeCategory(c) <> UnicodeCategory.NonSpacingMark Then
            r.Append(c)
        End If
    Next

    Return r.ToString()
End Function

C#

using System.Text;
using System.Globalization;

/// <summary>
/// Removes the special attributes of the letters passed in the word
/// </summary>
/// <param name="word">Word to be normalized</param>
public String RemoveDiacritics(String word)
{
  String normalizedString = word.Normalize(NormalizationForm.FormD);
  StringBuilder stringBuilder = new StringBuilder();
  int i;
  Char c;

  for (i = 0; i < normalizedString.Length; i++)
  {
    c = normalizedString[i];
    if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
      stringBuilder.Append(c);
  }

  return stringBuilder.ToString();
} 
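
To show what the approach above does and does not cover, here is a minimal usage sketch. The class name DiacriticsDemo and the sample strings are made up for illustration; the method body follows the same technique as the C# above. Note that FormD only applies canonical decomposition, so characters such as "ª", "º" (both present in the question's Select Case) and "ø" pass through unchanged.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical wrapper class, used only for this illustration.
public static class DiacriticsDemo
{
    // Same technique as above: decompose with FormD, then drop the combining marks.
    public static string RemoveDiacritics(string word)
    {
        string decomposed = word.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Accented Latin letters lose their marks: "aaaaa eeee c C"
        Console.WriteLine(RemoveDiacritics("áàãâä éèêë ç Ç"));

        // No canonical decomposition exists for these, so they come back as-is: "ª º ø"
        Console.WriteLine(RemoveDiacritics("ª º ø"));
    }
}
```

If the ordinal indicators should also map to "a" and "o", FormKD (compatibility decomposition) would be one option to try instead of FormD, at the cost of also rewriting other compatibility characters.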

I hope this helps people like me :)

Answer 3: (score: 0)

There is a simple way in .NET to normalize strings for comparison:

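using System.Text.RegularExpressions;

public static string NormalizeString(string value)
{
    // FormKD decomposes accented letters into base letters plus combining marks.
    string nameFormatted = value.Normalize(System.Text.NormalizationForm.FormKD);
    // Remove everything that is not an ASCII letter, digit, or space (this includes the marks).
    Regex reg = new Regex("[^a-zA-Z0-9 ]");
    return reg.Replace(nameFormatted, "");
}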
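
A short sketch of how this behaves in practice may help when deciding whether it fits. The class name NormalizeDemo and the sample inputs are assumptions for illustration only. Because the regular expression keeps only ASCII letters, digits, and spaces, punctuation is removed too, and letters that have no decomposition (such as "ß") are dropped rather than mapped to a root letter.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, used only for this illustration.
public static class NormalizeDemo
{
    public static string NormalizeString(string value)
    {
        // Compatibility decomposition: "è" becomes "e" + combining grave, "ª" becomes "a".
        string decomposed = value.Normalize(NormalizationForm.FormKD);
        // Strip anything outside ASCII letters, digits, and space.
        return Regex.Replace(decomposed, "[^a-zA-Z0-9 ]", "");
    }

    public static void Main()
    {
        Console.WriteLine(NormalizeString("Crème brûlée!")); // Creme brulee
        Console.WriteLine(NormalizeString("ª º"));           // a o
        Console.WriteLine(NormalizeString("Straße"));        // Strae ("ß" is dropped, not mapped)
    }
}
```

This makes it a reasonable fit for building comparison keys, but probably not for producing display text, since it silently deletes characters it cannot reduce to ASCII.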