Parsing from an aspx page to a text document

Date: 2012-08-03 00:30:31

Tags: c# vb.net

Using C# or VB.Net code, download the page at this link (http://24.173.220.131/carter/currentinmates.aspx), then parse the attributes on the page into a text document.

Output:

Name | BookDate | Charge | Bail | Release | Agency
ANDERSON, JAYME RAMONE | 05/04/2012 | SENTENCED | $0.00 | 5/2/2022 |
ANDERSON, JEFFERY CONARD | 02/06/2012 | SENTENCED | $0.00 | 2/5/2022 | CARTER COUNTY SHERIFF DEPT

2 answers:

Answer 0 (score: 2)

Add a reference to CsQuery. You can install it from NuGet or find it here: https://github.com/jamietre/CsQuery

using System;
using System.Linq;
using System.Text;
using CsQuery;

class Program
{
    static void Main(string[] args)
    {
        var stringBuilder = new StringBuilder();
        var url = "http://24.173.220.131/carter/currentinmates.aspx";

        // Download the page asynchronously; the callback runs once the DOM is available.
        CQ.CreateFromUrlAsync(url)
           .Then(response =>
           {
               var dom = response.Dom;

               // Select every row of the table with id "dgrdLandRecords".
               var trs = dom.Select("#dgrdLandRecords tr").Elements;
               foreach (var row in trs)
               {
                   stringBuilder.AppendLine();
                   var tds = row.ChildElements.ToList();

                   // Start at index 1 to skip the first cell of each row.
                   for (int i = 1; i < tds.Count; i++)
                   {
                       stringBuilder.Append(tds[i].Cq().Text());
                       stringBuilder.Append("|");
                   }
               }
               var result = stringBuilder.ToString();
               Console.Write(result);
           });

        // Waiting for a key press keeps the process alive until the download completes.
        Console.WriteLine("Press any key to exit.");
        Console.ReadKey();
    }
}
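The code above only prints the result to the console, while the question asks for a text document. A minimal sketch of that last step, assuming the cell texts of each row have already been collected into string sequences; the names `InmateExport`, `FormatRow`, `WriteRows` and the file name `inmates.txt` are made up for illustration, and the sample rows are taken from the expected output in the question:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class InmateExport
{
    // Joins the cells of one table row with " | ", trimming stray whitespace.
    public static string FormatRow(IEnumerable<string> cells)
    {
        return string.Join(" | ", cells.Select(c => c.Trim()));
    }

    // Writes one formatted line per row to `path`, overwriting any existing file.
    public static void WriteRows(string path, IEnumerable<IEnumerable<string>> rows)
    {
        File.WriteAllLines(path, rows.Select(r => FormatRow(r)));
    }
}

class ExportDemo
{
    static void Main()
    {
        // Sample data only; in the real program these would come from the parsed table.
        var rows = new List<string[]>
        {
            new[] { "Name", "BookDate", "Charge", "Bail", "Release", "Agency" },
            new[] { "ANDERSON, JEFFERY CONARD", "02/06/2012", "SENTENCED",
                    "$0.00", "2/5/2022", "CARTER COUNTY SHERIFF DEPT" }
        };

        InmateExport.WriteRows("inmates.txt", rows);   // hypothetical output path
        Console.Write(File.ReadAllText("inmates.txt"));
    }
}
```

Joining the cells with `string.Join` instead of appending "|" after every cell avoids the trailing separator at the end of each line.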

Answer 1 (score: 0)

The WebClient class is exactly what you are looking for.

Imports System
Imports System.IO
Imports System.Net

Public Class Test

Public Shared Sub Main(args() As String)
    Dim sURL As String
    If args Is Nothing OrElse args.Length = 0 Then
        'Throw New ApplicationException("Specify the URI of the resource to retrieve.")
        ' No URI on the command line: fall back to the page from the question.
        sURL = "http://24.173.220.131/carter/currentinmates.aspx"
    Else
        sURL = args(0)
    End If

    Using client As New WebClient()
        ' Add a user agent header in case the
        ' requested URI contains a query.
        client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)")

        Using data As Stream = client.OpenRead(sURL)
            Using reader As New StreamReader(data)
                Dim s As String = reader.ReadToEnd()
                Console.WriteLine(s)

                ' Here write the variable `s` to a text file,
                ' e.g. File.WriteAllText("currentinmates.txt", s) (the file name is only an example).
            End Using
        End Using
    End Using
End Sub 'Main
End Class 'Test
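For comparison, the same download-and-save step can be sketched in C#. This is only an illustration under a few assumptions: `PageSaver`, `SaveAsText` and the output file name `currentinmates.txt` are hypothetical, and the sketch saves the raw HTML without parsing the table. `WebClient` is used to match the answer; newer .NET versions recommend `HttpClient` instead.

```csharp
using System;
using System.IO;
using System.Net;

static class PageSaver
{
    // Downloads the resource at `address`, writes it to `path`, and returns the text.
    public static string SaveAsText(string address, string path)
    {
        using (var client = new WebClient())
        {
            string html = client.DownloadString(address);
            File.WriteAllText(path, html);
            return html;
        }
    }
}

class SaveDemo
{
    static void Main(string[] args)
    {
        // Use the first command-line argument if given, else the URL from the question.
        string url = args.Length > 0
            ? args[0]
            : "http://24.173.220.131/carter/currentinmates.aspx";

        string html = PageSaver.SaveAsText(url, "currentinmates.txt"); // hypothetical file name
        Console.WriteLine("Saved {0} characters.", html.Length);
    }
}
```

To get the pipe-delimited output from the question, the saved HTML would still need to be run through an HTML parser such as CsQuery, as in the first answer.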