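I've been scratching my head trying to find a solution. Googling turned up nothing relevant, so after several hours of trying to track down the cause, I'm asking for help here.
The odd thing is that the error doesn't occur right away. It usually shows up at a random point after a few thousand rows have been checked.
The application is a link extractor: it pulls links from different URIs and adds each newly found internal link to a DataGridView.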
When I look at the IntelliTrace exceptions, the error is reported as being thrown on this line:
DataGridView1.Rows.Add(New String() {HttpUtility.HtmlDecode(matchUrl), anchor_txt, "", "", "", True})
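It doesn't make sense to me why such a high index is reported, especially on this line, which only adds a new row. The index may have existed at some point, but after each scan I remove the URIs that don't contain string x and then scan the remaining links again, until no links are left.
Here is the stack trace, in case it helps (the system locale is Danish; the first line reads "Value '8797' is invalid for 'rowIndex'"):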
System.ArgumentException: InvalidArgument=Værdi '8797' er ugyldig for 'rowIndex'.
ved System.Windows.Forms.DataGridViewRow.GetState(Int32 rowIndex)
ved System.Windows.Forms.DataGridViewRowCollection.GetRowState(Int32 rowIndex)
ved System.Windows.Forms.DataGridViewRowCollection.UpdateRowCaches(Int32 rowIndex, DataGridViewRow& dataGridViewRow, Boolean adding)
ved System.Windows.Forms.DataGridViewRowCollection.OnCollectionChanged_PreNotification(CollectionChangeAction cca, Int32 rowIndex, Int32 rowCount, DataGridViewRow& dataGridViewRow, Boolean changeIsInsertion)
ved System.Windows.Forms.DataGridViewRowCollection.OnCollectionChanged(CollectionChangeEventArgs e, Int32 rowIndex, Int32 rowCount)
ved System.Windows.Forms.DataGridViewRowCollection.AddInternal(Boolean newRow, Object[] values)
ved System.Windows.Forms.DataGridViewRowCollection.Add(Object[] values)
ved Link_Extractor.Form1.InternalThread2() i C:\Users\USER\Documents\Visual Studio 2010\Projects\Link\Link\Form1.vb:linje 568
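I also get an "Object reference not set to an instance of an object" NullReferenceException on the For Each line. I assume this is because Cells(0) is invalid for some reason. I'll check that now, but it still doesn't explain the first exception.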
For Each itm As DataGridViewRow In DataGridView1.Rows
    If itm.Cells(0).Value = HttpUtility.HtmlDecode(matchUrl) Then
        exists = True
    End If
Next
Here is the stack trace for that one:
System.NullReferenceException: Objektreferencen er ikke indstillet til en forekomst af et objekt.
ved System.Windows.Forms.DataGridViewRowCollection.OnCollectionChanged(CollectionChangeEventArgs e, Int32 rowIndex, Int32 rowCount)
ved System.Windows.Forms.DataGridViewRowCollection.AddInternal(Boolean newRow, Object[] values)
ved System.Windows.Forms.DataGridViewRowCollection.Add(Object[] values)
ved Link_Extractor.Form1.InternalThread3() i C:\Users\USER\Documents\Visual Studio 2010\Projects\Link\Link\Form1.vb:linje 808
Edit: here is some more of the code, which runs after the HTML page has been received.
If Not data = "" Then
    Dim links As MatchCollection = Regex.Matches(data, "<a.*?href=[""']?([^'"">\ ]*)[""']?[^>]*>([\s\S]*?)<\/a>")
    For Each match As Match In links
        Dim matchUrl As String = HttpUtility.HtmlDecode(match.Groups(1).Value)
        Dim anchor As String = HttpUtility.HtmlDecode(StripTags(match.Groups(2).Value))
        'Ignore all anchor links
        If matchUrl.StartsWith("#") Then
            Continue For
        End If
        'Ignore all javascript calls
        If matchUrl.ToLower.StartsWith("javascript:") Then
            Continue For
        End If
        'Ignore all email links
        If matchUrl.ToLower.StartsWith("mailto:") Then
            Continue For
        End If
        'Ignore all URLs with @
        If matchUrl.ToLower.Contains("@") Then
            Continue For
        End If
        'Ignore all empty domains
        If matchUrl Is Nothing Then
            Throw New Exception("Empty matchurl")
        End If
        If anchor Is Nothing Then
            Throw New Exception("Empty anchor text.")
        End If
        'For internal links, build the url mapped to the base address
        If Not matchUrl.StartsWith("http://") And Not matchUrl.StartsWith("https://") Then
            'Most likely an internal link
            matchUrl = MapUrl(url, matchUrl)
            Try
                exists = False
                For Each itm As DataGridViewRow In DataGridView1.Rows
                    If Not itm.Cells(0) Is Nothing Then
                        If itm.Cells(0).Value = matchUrl Then
                            exists = True
                            Exit For
                        End If
                    End If
                Next
                If DataGridView1.Rows.Count > 0 AndAlso exists = True Then
                    Continue For
                Else
                    DataGridView1.Rows.Add(New String() {matchUrl, anchor, "", "", "", True})
                End If
            Catch ex As Exception
                MessageBox.Show(ex.Message)
            End Try
        Else
            'It's possible that it still can be an internal link, but also external. Compare the baseaddress with the URL to check if it's still the same domain
            Dim baseaddress As Uri = New Uri(url)
            Dim s_baseaddress As String = baseaddress.Host.ToString
            'Check for subdomain and remove
            If s_baseaddress.ToCharArray().Count(Function(c) c = "."c) >= 2 Then
                Dim subdomain As String = Split(s_baseaddress, ".").First
                s_baseaddress = s_baseaddress.Replace(subdomain & ".", "")
            End If
            Dim s_url As String = Nothing
            Dim url2 As Uri = Nothing
            Try
                url2 = New Uri(HttpUtility.HtmlDecode(matchUrl))
                s_url = url2.Host.ToString
                If s_url.ToCharArray().Count(Function(c) c = "."c) >= 2 Then
                    Dim subdomain As String = Split(s_url, ".").First
                    s_url = s_url.Replace(subdomain & ".", "")
                End If
            Catch ex As Exception
                'Invalid URI
                Continue For
            End Try
            If s_baseaddress.Equals(s_url) Then
                'Internal
                Try
                    exists = False
                    For Each itm As DataGridViewRow In DataGridView1.Rows
                        If Not itm.Cells(0) Is Nothing Then
                            If itm.Cells(0).Value = matchUrl Then
                                exists = True
                                Exit For
                            End If
                        End If
                    Next
                    If DataGridView1.Rows.Count > 0 AndAlso exists = True Then
                        Continue For
                    Else
                        DataGridView1.Rows.Add(New String() {matchUrl, anchor, "", "", "", True})
                    End If
                Catch ex As Exception
                    MessageBox.Show(ex.Message)
                End Try
            Else
                'External link
                Dim m_url As String = matchUrl
                m_url = m_url.Replace(" ", "")
                'Trim url to root to save the time of removing duplicates
                Dim theUri = Nothing
                Try
                    theUri = New Uri(m_url)
                Catch ex As Exception
                    'Invalid link, go to next
                    Continue For
                End Try
                Dim theDomain = theUri.GetLeftPart(UriPartial.Authority)
                Try
                    exists = False
                    For Each itm As DataGridViewRow In DataGridView2.Rows
                        If Not itm.Cells(0) Is Nothing Then
                            If itm.Cells(0).Value = theDomain.ToString Then
                                exists = True
                                Exit For
                            End If
                        End If
                    Next
                    If DataGridView2.Rows.Count > 0 AndAlso exists = True Then
                        Continue For
                    Else
                        DataGridView2.Rows.Add(New String() {theDomain.ToString, anchor, ""})
                        e_links_c += 1
                    End If
                Catch ex As Exception
                    MessageBox.Show(ex.Message)
                End Try
            End If
        End If
    Next
    'OK
    DataGridView1.Rows(thi3).Cells(2).Value = "OK"
    DataGridView1.Rows(thi3).Cells(3).Value = e_links_c.ToString
Else
    'Error
    DataGridView1.Rows(thi3).Cells(2).Value = "Empty response"
End If
Answer 0 (score: 0)
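It seems that HttpUtility.HtmlDecode(matchUrl) is giving you invalid data. Since your snippet contains Continue For, I assume you have two nested For loops.
- Does matchUrl have a valid value?
- Validate the value of HttpUtility.HtmlDecode(matchUrl), just as you validate anchor_txt.
- Wrap the code in a Try and Catch block. If you aren't using one, do so.
- Use Exit For after the exists = True statement in the inner For loop.
- HttpUtility.HtmlDecode(matchUrl) is evaluated on every iteration of the inner For loop. Assign its value to a variable before the For loop, validate the output, and then use that variable inside the For loop.
Try the following code: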
Try
    exists = False
    'Put a breakpoint on the next statement and check the values of matchUrl and HttpUtility.HtmlDecode(matchUrl)
    Dim urlDecode As String = HttpUtility.HtmlDecode(matchUrl)
    If urlDecode Is Nothing Then
        Throw New Exception("Empty urlDecode.")
    End If
    For Each itm As DataGridViewRow In DataGridView1.Rows
        If itm.Cells(0).Value = urlDecode Then
            exists = True
            Exit For
        End If
    Next
    If DataGridView1.Rows.Count > 0 AndAlso exists Then
        Continue For
    Else
        Dim anchor_txt As String = HttpUtility.HtmlDecode(StripTags(match.Groups(2).Value))
        If anchor_txt Is Nothing Then
            Throw New Exception("Empty anchor text.")
        End If
        'Reuse the decoded value instead of decoding it again
        DataGridView1.Rows.Add(New String() {urlDecode, anchor_txt, "", "", "", True})
    End If
Catch ex As Exception
    ' Show the exception's message.
    MessageBox.Show(ex.Message)
End Try
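
One more thing that may be worth ruling out: both stack traces show Rows.Add being called from methods named InternalThread2 and InternalThread3. If those run on background threads (an assumption, since the question doesn't show how they are started), the grid is being modified off the UI thread, and Windows Forms controls are not thread-safe. That could plausibly produce intermittent errors like these, though I can't confirm it from the posted code. Below is a minimal sketch of moving the duplicate check and the add onto the UI thread with Control.Invoke. AddInternalLink is a hypothetical helper name, meant to live inside Form1:

```vbnet
' Hypothetical helper inside Form1. The worker threads would call this
' instead of touching DataGridView1 directly.
Private Sub AddInternalLink(ByVal url As String, ByVal anchor As String)
    If DataGridView1.InvokeRequired Then
        ' Called from a non-UI thread: re-run this method on the thread that owns the grid.
        DataGridView1.Invoke(New Action(Of String, String)(AddressOf AddInternalLink), url, anchor)
        Return
    End If

    ' From here on we are on the UI thread, so no other caller of this
    ' helper can change the row collection while it is being enumerated.
    For Each row As DataGridViewRow In DataGridView1.Rows
        ' TryCast yields Nothing for empty or non-string cells, which never equals url.
        If String.Equals(TryCast(row.Cells(0).Value, String), url) Then
            Return ' Already listed
        End If
    Next

    ' Assumption: same six columns as in the question; True is passed as a Boolean here.
    DataGridView1.Rows.Add(New Object() {url, anchor, "", "", "", True})
End Sub
```

Inside the worker loop, the For Each/Rows.Add block would then shrink to a single call such as AddInternalLink(matchUrl, anchor). Any other place where a worker reads or writes the grid (for example the Rows(thi3) status updates) would need the same treatment.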