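I wrote this function to automatically normalize gender to M or F, based on different values held in string arrays. It works fine, but my manager told me to use a Dictionary because he says it is more efficient. I don't know how to do that. Could someone help me understand how? Thanks.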
Public Function AutoGender(ByVal dt As DataTable) As DataTable
Dim Gender As String = ""
Dim Mkeywords() As String = {"boy", "boys", "male", "man", "m", "men", "guy"}
Dim Fkeywords() As String = {"girl", "girls", "female", "woman", "f", "women", "chick"}
Dim row As DataRow
For Each row In dt.Rows
If Mkeywords.Contains(row("Gender").ToString.ToLower) Then
Gender = "M"
row("Gender") = Gender
ElseIf Fkeywords.Contains(row("Gender").ToString.ToLower) Then
Gender = "F"
row("Gender") = Gender
End If
Next
Return dt
End Function
Answer 0 (score: 7)
The following example shows how to use a Dictionary(Of String, String) to look up whether a synonym is known:
Shared GenderSynonyms As Dictionary(Of String, String) = New Dictionary(Of String, String) From
{{"boy", "M"}, {"boys", "M"}, {"male", "M"}, {"man", "M"}, {"m", "M"}, {"men", "M"}, {"guy", "M"},
{"girl", "F"}, {"girls", "F"}, {"female", "F"}, {"woman", "F"}, {"f", "F"}, {"women", "F"}, {"chick", "F"}}
Public Function AutoGender(ByVal dt As DataTable) As DataTable
If dt.Columns.Contains("Gender") Then
For Each row As DataRow In dt.Rows
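' Field(Of String) returns Nothing for DBNull, so fall back to an empty string before lowercasing
Dim oldGender = If(row.Field(Of String)("Gender"), String.Empty).ToLower()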
Dim newGender As String = String.Empty
If GenderSynonyms.TryGetValue(oldGender, newGender) Then
row.SetField("Gender", newGender)
End If
Next
End If
Return dt
End Function
Note that I've used a collection initializer to fill the Dictionary, which is a convenient way to initialize a collection from literals. You could also use the Add method.
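For comparison, here is a minimal sketch of filling the same lookup table with Add instead of a collection initializer. The module name GenderSynonymsSketch and the helper BuildGenderSynonyms are made up for illustration and are not part of the code above.

```vbnet
Imports System
Imports System.Collections.Generic

Module GenderSynonymsSketch
    ' Hypothetical helper: builds the synonym table with explicit Add calls.
    Function BuildGenderSynonyms() As Dictionary(Of String, String)
        Dim synonyms As New Dictionary(Of String, String)
        For Each word In {"boy", "boys", "male", "man", "m", "men", "guy"}
            synonyms.Add(word, "M")
        Next
        For Each word In {"girl", "girls", "female", "woman", "f", "women", "chick"}
            synonyms.Add(word, "F")
        Next
        Return synonyms
    End Function

    Sub Main()
        Dim synonyms = BuildGenderSynonyms()
        Dim code As String = Nothing
        ' TryGetValue checks for the key and fetches the value in one lookup.
        If synonyms.TryGetValue("woman", code) Then
            Console.WriteLine(code) ' prints F
        End If
    End Sub
End Module
```

Both styles end up calling Add, so a duplicate key raises an ArgumentException either way.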
Edit: Another, possibly more concise approach is to use two HashSet(Of String), one for the male synonyms and one for the female synonyms:
Shared maleSynonyms As New HashSet(Of String) From
{"boy", "boys", "male", "man", "m", "men", "guy"}
Shared femaleSynonyms As New HashSet(Of String) From
{"girl", "girls", "female", "woman", "f", "women", "chick"}
Public Function AutoGender(ByVal dt As DataTable) As DataTable
If dt.Columns.Contains("Gender") Then
For Each row As DataRow In dt.Rows
' Field(Of String) returns Nothing for DBNull, so fall back to an empty string before lowercasing
Dim oldGender = If(row.Field(Of String)("Gender"), String.Empty).ToLower()
If maleSynonyms.Contains(oldGender) Then
row.SetField("Gender", "M")
ElseIf femaleSynonyms.Contains(oldGender) Then
row.SetField("Gender", "F")
End If
Next
End If
Return dt
End Function
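The elements of a HashSet must also be unique, so it cannot contain duplicate Strings (just like the keys of a Dictionary), but it does not hold key-value pairs, it is only a set.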
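To illustrate the uniqueness point, here is a small sketch showing how the two collections react to a duplicate. The module name UniquenessSketch is only for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module UniquenessSketch
    Sub Main()
        Dim maleSynonyms As New HashSet(Of String) From {"boy", "man"}
        ' HashSet.Add returns False and leaves the set unchanged when the element already exists.
        Dim added As Boolean = maleSynonyms.Add("boy")
        Console.WriteLine(added)              ' prints False
        Console.WriteLine(maleSynonyms.Count) ' prints 2

        Dim genderSynonyms As New Dictionary(Of String, String) From {{"boy", "M"}}
        Try
            ' Dictionary.Add throws when the key is already present.
            genderSynonyms.Add("boy", "M")
        Catch ex As ArgumentException
            Console.WriteLine("duplicate key rejected")
        End Try
    End Sub
End Module
```

So a repeated word in the HashSet is silently ignored, while a repeated key in the Dictionary fails at run time.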
Answer 1 (score: 3)
Just change the two arrays to dictionaries and call ContainsKey instead of Contains.
Dim Mkeywords = New Dictionary(Of String, String) From
{{"boy", ""}, {"boys", ""}, {"male", ""}, {"man", ""}, {"m", ""}, {"men", ""}, {"guy", ""}}
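(and do the same for the female keywords)
However, as you may have noticed, I put in all those empty strings. That is because a dictionary has values as well as keys, and since we don't use the values here, I made them empty strings. To get the same O(1) lookup without all the extraneous values, you can use a HashSet in a similar way.
Now all you need to change, as I said, is to use ContainsKey (or, if you go the HashSet route, it is still just Contains):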
If Mkeywords.ContainsKey(row("Gender").ToString.ToLower) Then
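One last note: this will only be more efficient, and noticeably so, if the data starts to grow significantly. As you have it now, with so few elements, using a dictionary may even be slower.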
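If you want to check this on your own machine, a rough timing sketch like the following could help. The iteration count is arbitrary, the module name LookupTimingSketch is made up, and the numbers will vary by machine and runtime, so treat it as a quick experiment rather than a proper benchmark.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Linq

Module LookupTimingSketch
    Sub Main()
        Dim keywords() As String = {"boy", "boys", "male", "man", "m", "men", "guy"}
        Dim keywordSet As New HashSet(Of String)(keywords)
        Const iterations As Integer = 1000000 ' arbitrary; adjust as needed
        Dim hits As Integer = 0

        ' Linear search through the array (Enumerable.Contains from System.Linq).
        Dim sw = Stopwatch.StartNew()
        For i = 1 To iterations
            If keywords.Contains("guy") Then hits += 1
        Next
        sw.Stop()
        Console.WriteLine("Array:   {0} ms", sw.ElapsedMilliseconds)

        ' Hash-based lookup in the set.
        sw.Restart()
        For i = 1 To iterations
            If keywordSet.Contains("guy") Then hits += 1
        Next
        sw.Stop()
        Console.WriteLine("HashSet: {0} ms", sw.ElapsedMilliseconds)

        ' Use the counter so the loops are not trivially removable.
        Console.WriteLine("Hits: {0}", hits)
    End Sub
End Module
```

With only seven keywords the difference may be negligible in either direction, so measure with a keyword list of realistic size before deciding.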