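Get a table from an .aspx web page into a CSV file using C#

Asked: 2018-01-15 03:09:39

Tags: c# datatable

I have a web page ending in .aspx that contains a table. Is there a way to convert that table to a CSV file using C#? My HTML looks like this: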

<div class="section_content">
    <div>
<table class="table table-bordered table-condensed table-striped" cellspacing="0" id="gv_report">
    <thead>
        <tr>
            <th scope="col">Lot</th><th scope="col">Op</th><th scope="col">Status</th><th scope="col">iDispo Status</th><th scope="col">Dispo By</th><th scope="col">Dispo Date</th><th scope="col">T.R Count</th><th scope="col">View</th>
        </tr>
    </thead><tbody>
        <tr>
            <td>7649B703</td><td>6262</td><td>FAIL</td><td>FAIL</td><td>mly2</td><td>12/10/2016 4:30:47 PM</td><td>1</td><td>
                    <a href='/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B703&Location=6262'
                        target="_blank"><i class="icon-eye-open"></i></a>
                </td>

1 answer:

Answer 0 (score: 0)

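You can try Cinchoo ETL, an open-source file helper library for parsing, transforming, and writing files in different formats.

In your case, you want to convert an HTML table to a CSV file. The sample code below shows how to do it.

Take this sample HTML table: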

<table id="example1" border="1"  style="background-color:#FFFFCC" width="0%" cellpadding="3" cellspacing="3">
  <tr>
    <th>Title</th>
    <th>Name</th>
    <th>Phone</th>
  </tr>
  <tr>
    <td>Mr.</td>
    <td>John</td>
    <td>07868785831</td>
  </tr>
  <tr>
    <td>Miss</td>
    <td>Linda</td>
    <td>0141-2244-5566</td>
  </tr>
  <tr>
    <td>Master</td>
    <td>Jack</td>
    <td>0142-1212-1234</td>
  </tr>
  <tr>
    <td>Mr.</td>
    <td>Bush</td>
    <td>911-911-911</td>
  </tr>
</table>

The following code scrapes the Title, Name, and Phone column values and produces a CSV file:

using (var cr = new ChoCSVWriter("HtmlTable.csv").WithFirstLineHeader())
{
    using (var xr = new ChoXmlReader("HTMLTable.xml").WithXPath("/table/tr")
        .WithField("Title", xPath: "td[1]", fieldType: typeof(string))
        .WithField("Name", xPath: "td[2]", fieldType: typeof(string))
        .WithField("Phone", xPath: "td[3]", fieldType: typeof(string))
    )
    {
        cr.Write(xr.Where(r => !((string)r.Title).IsNullOrWhiteSpace()));
    }
}

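In the code above, a ChoXmlReader is opened on HTMLTable.xml (the table markup saved as a file) with a column specification and an XPath for each field. The reader is then passed to the ChoCSVWriter to create the CSV file.

Here is the CSV output for the HTML table: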
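For comparison, the same read-rows-then-write-CSV idea can be sketched with only the .NET base class library. This is not Cinchoo ETL's API: it swaps in `System.Xml.Linq`, and the class and method names (`HtmlTableToCsv`, `ToCsv`, `Escape`) are hypothetical. It assumes the table is well-formed XML and that its `<tr>` rows are direct children of `<table>`, as in the sample above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class HtmlTableToCsv
{
    // Quote a field when it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes.
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
            return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Convert a well-formed <table> whose <tr> rows are direct children into CSV text.
    public static string ToCsv(string tableXml)
    {
        XElement table = XDocument.Parse(tableXml).Root;
        var lines = new List<string>();
        foreach (XElement row in table.Elements("tr"))
        {
            // Header cells are <th> and data cells are <td>; keep both in document order.
            IEnumerable<string> cells = row.Elements()
                .Where(c => c.Name.LocalName == "th" || c.Name.LocalName == "td")
                .Select(c => Escape(c.Value.Trim()));
            lines.Add(string.Join(",", cells));
        }
        return string.Join(Environment.NewLine, lines);
    }
}
```

Passing the contents of HTMLTable.xml to `HtmlTableToCsv.ToCsv` and writing the returned string with `File.WriteAllText` should give one CSV line per table row, with the header line taken from the `<th>` cells.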

Title,Name,Phone
Mr.,John,07868785831
Miss,Linda,0141-2244-5566
Master,Jack,0142-1212-1234
Mr.,Bush,911-911-911

Hope it helps.

Update

Here is your XML with a few corrections:

<div class="section_content">
  <div>
    <table class="table table-bordered table-condensed table-striped" cellspacing="0" id="gv_report">
      <thead>
        <tr>
          <th scope="col">Lot</th>
          <th scope="col">Op</th>
          <th scope="col">Status</th>
          <th scope="col">iDispo Status</th>
          <th scope="col">Dispo By</th>
          <th scope="col">Dispo Date</th>
          <th scope="col">T.R Count</th>
          <th scope="col">View</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <td>7649B703</td>
          <td>6262</td>
          <td>FAIL</td>
          <td>FAIL</td>
          <td>mly2</td>
          <td>12/10/2016 4:30:47 PM</td>
          <td>1</td>
          <td>
            <a href='/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B703&amp;Location=6262' target="_blank">
              <i class="icon-eye-open"></i>
            </a>
          </td>
        </tr>
        <tr>
          <td>7649B703</td>
          <td>6262</td>
          <td>FAIL</td>
          <td>FAIL</td>
          <td>mly2</td>
          <td>12/10/2016 4:30:47 PM</td>
          <td>1</td>
          <td>
            <a href='/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B703&amp;Location=6262' target="_blank">
              <i class="icon-eye-open"></i>
            </a>
          </td>
        </tr>
      </tbody>
    </table>
  </div>
</div>
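Because the reader is pointed at an .xml file, it seems safe to assume the markup has to be well-formed XML, although the answer does not say how Cinchoo ETL treats malformed input. A quick way to check a saved page before converting it is to run it through a strict parser. The sketch below uses `System.Xml.Linq` from the base class library; `WellFormedCheck` and `TryParse` are hypothetical names. Typical problems in scraped HTML are unclosed tags and bare `&` characters in `href` query strings, which XML requires to be written as `&amp;`.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class WellFormedCheck
{
    // Returns null when the markup parses as XML, otherwise the parser's error message.
    public static string TryParse(string markup)
    {
        try
        {
            XDocument.Parse(markup);
            return null;
        }
        catch (XmlException ex)
        {
            // The message includes the line and position of the first problem found.
            return ex.Message;
        }
    }
}
```

A non-null result points at the first place the parser gave up, which is usually enough to find the tag or attribute that needs correcting.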

Here is the code to convert it to a CSV file:

using (var cr = new ChoCSVWriter("HtmlTable.csv").WithFirstLineHeader())
{
    using (var xr = new ChoXmlReader("HTMLTable.xml").WithXPath("//tbody/tr")
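        // Lot values such as 7649B703 are not numeric, so read them as string;
        // with typeof(int) the column comes out as 0, as in the output shown below.
        .WithField("Lot", xPath: "td[1]", fieldType: typeof(string))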
        .WithField("Op", xPath: "td[2]", fieldType: typeof(int))
        .WithField("Status", xPath: "td[3]", fieldType: typeof(string))
        .WithField("iDispoStatus", xPath: "td[4]", fieldType: typeof(string))
        .WithField("DispoBy", xPath: "td[5]", fieldType: typeof(string))
        .WithField("DispoDate", xPath: "td[6]", fieldType: typeof(DateTime))
        .WithField("TRCount", xPath: "td[7]", fieldType: typeof(int))
        .WithField("View", xPath: "td[8]/a/@href", fieldType: typeof(string))
    )
    {
        cr.Write(xr);
    }
}

CSV output:

Lot,Op,Status,iDispoStatus,DispoBy,DispoDate,TRCount,View
0,6262,FAIL,FAIL,mly2,12/10/2016,1,/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B70
0,6262,FAIL,FAIL,mly2,12/10/2016,1,/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B703
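The 0 in the Lot column is consistent with 7649B703 not being convertible to an integer. How Cinchoo ETL handles a failed conversion is an assumption here, but the underlying check is easy to reproduce with the base class library before choosing a `fieldType`. In the sketch below, `FieldTypeCheck`, `IsInt`, and `ParseDate` are hypothetical helper names, and the date helper assumes the month/day/year order of the invariant culture.

```csharp
using System;
using System.Globalization;

public static class FieldTypeCheck
{
    // True when the whole cell parses as a base-10 integer.
    public static bool IsInt(string cell)
    {
        int value;
        return int.TryParse(cell, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);
    }

    // Parses a cell such as "12/10/2016 4:30:47 PM" as month/day/year with a 12-hour clock.
    public static DateTime ParseDate(string cell)
    {
        return DateTime.Parse(cell, CultureInfo.InvariantCulture);
    }
}
```

`IsInt("7649B703")` returns false while `IsInt("6262")` returns true, which suggests declaring Lot as a string and keeping Op as an int.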