以下代码将每行拆分为单词,并将每行中的第一个单词存储到数组列表中,将第二个单词存储到另一个数组列表中,依此类推。然后,它从每个列表中选择最常用的单词作为正确的单词。
Module Module1
Sub Main()
Dim correctLine As String = ""
Dim line1 As String = "Canda has more than ones official language"
Dim line2 As String = "Canada has more than one oficial languages"
Dim line3 As String = "Canada has nore than one official lnguage"
Dim line4 As String = "Canada has nore than one offical language"
Dim wordsOfLine1() As String = line1.Split(" ")
Dim wordsOfLine2() As String = line2.Split(" ")
Dim wordsOfLine3() As String = line3.Split(" ")
Dim wordsOfLine4() As String = line4.Split(" ")
For i As Integer = 0 To wordsOfLine1.Length - 1
Dim wordAllLinesTemp As New List(Of String)(New String() {wordsOfLine1(i), wordsOfLine2(i), wordsOfLine3(i), wordsOfLine4(i)})
Dim counts = From n In wordAllLinesTemp
Group n By n Into Group
Order By Group.Count() Descending
Select Group.First
correctLine = correctLine & counts.First & " "
Next
correctLine = correctLine.Remove(correctLine.Length - 1)
Console.WriteLine(correctLine)
Console.ReadKey()
End Sub
End Module
我的问题:如何使用不同数量的单词行。我的意思是这里每行的长度是7个单词,for循环使用这个长度(长度为1)。假设第3行包含5个单词。
答案 0 :(得分:0)
编辑:无意中有正确的索引应该是最短的。
据我所知,你试图看哪条线最接近正确的线。
您可以使用以下代码获取levenshtein距离:
Formatter fmt = new Formatter();
double value = Double.parseDouble(fmt.format("%.2f", x).toString());
然后,这将用于确定哪条线最接近:
Public Function LevDist(ByVal s As String,
ByVal t As String) As Integer
Dim n As Integer = s.Length
Dim m As Integer = t.Length
Dim d(n + 1, m + 1) As Integer
If n = 0 Then
Return m
End If
If m = 0 Then
Return n
End If
Dim i As Integer
Dim j As Integer
For i = 0 To n
d(i, 0) = i
Next
For j = 0 To m
d(0, j) = j
Next
For i = 1 To n
For j = 1 To m
Dim cost As Integer
If t(j - 1) = s(i - 1) Then
cost = 0
Else
cost = 1
End If
d(i, j) = Math.Min(Math.Min(d(i - 1, j) + 1, d(i, j - 1) + 1),
d(i - 1, j - 1) + cost)
Next
Next
Return d(n, m)
End Function