How do I parse this table into a DataGridView?

Asked: 2019-11-28 14:03:10

Tags: vb.net parsing datagridview html-agility-pack

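I found an example that parses a web table to CSV and writes the data to a text file on my computer, but I want to load the parsed data into my DataGridView instead of writing it to a location on disk.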

  

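I don't understand why I can't post this question, because it says: "It looks like your post is mostly code; please add some more details." Stuckoverflow is not good! I don't need a long speech, just a few words and a code example. That is the purpose of this site, without a "write more, dude" warning popping up, lol.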

Here is the HTML of the page:

            <table id="example" class="display" style="width:100%">
                <thead>
                    <tr>
                        <th>Name</th>
                        <th>Position</th>
                        <th>Office</th>
                        <th>Age</th>
                        <th>Start date</th>
                        <th>Salary</th>
                    </tr>
                </thead>
                <tbody>

                    <tr>
                        <td>Jennifer Chang</td>
                        <td>Regional Director</td>
                        <td>Singapore</td>
                        <td>28</td>
                        <td>2010/11/14</td>
                        <td>$357,650</td>
                    </tr>
                    <tr>
                        <td>Brenden Wagner</td>
                        <td>Software Engineer</td>
                        <td>San Francisco</td>
                        <td>28</td>
                        <td>2011/06/07</td>
                        <td>$206,850</td>
                    </tr>       
                    <tr>
                        <td>Sakura Yamamoto</td>
                        <td>Support Engineer</td>
                        <td>Tokyo</td>
                        <td>37</td>
                        <td>2009/08/19</td>
                        <td>$139,575</td>
                    </tr>

                    <tr>
                        <td>Donna Snider</td>
                        <td>Customer Support</td>
                        <td>New York</td>
                        <td>27</td>
                        <td>2011/01/25</td>
                        <td>$112,000</td>
                    </tr>
                </tbody>
                <tfoot>
                    <tr>
                        <th>Name</th>
                        <th>Position</th>
                        <th>Office</th>
                        <th>Age</th>
                        <th>Start date</th>
                        <th>Salary</th>
                    </tr>
                </tfoot>
            </table>        

Here is the code that needs to be modified:

Imports System.IO
Imports System.Net
Imports HtmlAgilityPack

Public Class Class1

    Public Function Demo1() As DataTable
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12
        Dim Document As New HtmlAgilityPack.HtmlDocument
        Dim myHttpWebRequest = CType(WebRequest.Create("https://website.com/table.html"), HttpWebRequest)
        myHttpWebRequest.UserAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
        ' Request the page once and dispose the response after the document is loaded
        Using res As HttpWebResponse = CType(myHttpWebRequest.GetResponse(), HttpWebResponse)
            Document.Load(res.GetResponseStream(), True)
        End Using

        Dim table As HtmlAgilityPack.HtmlNode = Document.DocumentNode.SelectSingleNode("//table[@id='example']")

        Dim dt As New DataTable()

        If table IsNot Nothing Then
            ' Column names come from the <th> cells inside <thead>
            Dim headers = table.SelectNodes("thead/tr/th")
            If headers IsNot Nothing Then
                For Each header As HtmlAgilityPack.HtmlNode In headers
                    dt.Columns.Add(New DataColumn(header.InnerText.Trim()))
                Next
            End If

            ' Data rows are the <tr> elements inside <tbody>
            Dim rows = table.SelectNodes("tbody/tr")
            If rows IsNot Nothing Then
                For Each row As HtmlAgilityPack.HtmlNode In rows
                    Dim cols = row.SelectNodes("td")
                    If cols Is Nothing Then Continue For

                    Dim dr As DataRow = dt.NewRow()
                    For i As Integer = 0 To cols.Count - 1
                        ' Ignore extra cells that have no matching column
                        If i >= dt.Columns.Count Then Exit For
                        dr(i) = cols(i).InnerText.Trim()
                    Next
                    dt.Rows.Add(dr)
                Next
            End If

        End If
        Return dt

    End Function

End Class

Thanks for the help.

1 Answer:

Answer 0: (score: 1)

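First, select the node closest to what you need. That can be the table itself or something nearer to the data. Since I can't see the exact page, I used the HTML you provided as the example (this is one way to do it).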

What I basically do is navigate to the data rows, loop over them, grab the data from each column of the current row, and put it into the DGV.

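I created the DGV column headers in the designer beforehand. (You can do it programmatically, but for a fixed/single website you can just do it by hand.)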
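If you would rather create the headers in code than in the designer, a minimal sketch could look like the following. The module name `DgvColumnsSketch` and the helper `AddTableColumns` are made up for illustration, and the header names are simply copied from the `<th>` cells of the sample table.

```vbnet
Imports System.Windows.Forms

Module DgvColumnsSketch
    ' Hypothetical helper: adds one text column per header of the sample table.
    Public Sub AddTableColumns(grid As DataGridView)
        Dim headers As String() = {"Name", "Position", "Office", "Age", "Start date", "Salary"}
        grid.Columns.Clear()
        For Each header As String In headers
            ' Columns.Add(columnName, headerText); the column name is the header without spaces
            grid.Columns.Add(header.Replace(" ", ""), header)
        Next
    End Sub
End Module
```

Calling `AddTableColumns(DataGridView1)` before the row loop would stand in for the designer step.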

Imports HtmlAgilityPack

Public Class Form1
    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        ' Fully qualified to avoid a name clash with System.Windows.Forms.HtmlDocument
        Dim WebDoc As New HtmlAgilityPack.HtmlDocument
        WebDoc.LoadHtml(TextBox1.Text) ' Your URL appears to be blank, so I copied the HTML into a TextBox

        Dim RowNodes As HtmlNodeCollection = WebDoc.DocumentNode.SelectNodes("//tbody/tr")
        If RowNodes Is Nothing Then Return ' SelectNodes returns Nothing when nothing matches

        For Each RowNode As HtmlNode In RowNodes
            ' "./td" selects only the cell elements and skips the whitespace #text nodes between them
            Dim CellNodes As HtmlNodeCollection = RowNode.SelectNodes("./td")
            If CellNodes Is Nothing Then Continue For

            ' Copy the cell texts in order; empty cells are kept so the columns stay aligned
            Dim Cells(CellNodes.Count - 1) As String
            For i As Integer = 0 To CellNodes.Count - 1
                Cells(i) = CellNodes(i).InnerText.Trim()
            Next
            DataGridView1.Rows.Add(Cells)
        Next
    End Sub
End Class
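As an alternative to adding rows by hand, the `DataTable` returned by the question's `Demo1()` function could be bound to the grid directly. This is only a sketch: `DataBindingSketch` and `ShowTable` are hypothetical names, and it assumes the grid has no designer-made columns and is not also filled through `Rows.Add`, since a data-bound grid does not accept manually added rows.

```vbnet
Imports System.Data
Imports System.Windows.Forms

Module DataBindingSketch
    ' Hypothetical helper: shows an already-filled DataTable in the grid.
    Public Sub ShowTable(grid As DataGridView, table As DataTable)
        ' Let the grid create one column per DataColumn once it is displayed
        grid.AutoGenerateColumns = True
        grid.DataSource = table
    End Sub
End Module
```

In the form this would be called as `ShowTable(DataGridView1, New Class1().Demo1())`.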


Keep in mind that all the data is treated as strings. If you need proper data types, you will need additional code to convert the values.
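As a rough illustration of that conversion step, the sketch below parses the three non-text columns of the sample table. The module and function names are hypothetical, and it assumes the formats seen in the sample rows: a plain integer age, dates written as `yyyy/MM/dd`, and salaries in US style with a `$` sign and thousands separators.

```vbnet
Imports System
Imports System.Globalization

Module CellConversionSketch
    ' Age column, e.g. "28"
    Public Function ParseAge(text As String) As Integer
        Return Integer.Parse(text.Trim(), CultureInfo.InvariantCulture)
    End Function

    ' Start date column, e.g. "2010/11/14" (assumed format: yyyy/MM/dd)
    Public Function ParseStartDate(text As String) As Date
        Return Date.ParseExact(text.Trim(), "yyyy/MM/dd", CultureInfo.InvariantCulture)
    End Function

    ' Salary column, e.g. "$357,650" (assumed culture: en-US)
    Public Function ParseSalary(text As String) As Decimal
        Return Decimal.Parse(text.Trim(), NumberStyles.Currency, CultureInfo.GetCultureInfo("en-US"))
    End Function
End Module
```

These functions throw a `FormatException` on unexpected input, so for a live website the `TryParse` variants may be the safer choice.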