I have a web page ending in .aspx that contains a table. Is there any way to convert that table to a CSV file using C#? My HTML looks like this:
<div class="section_content">
<div>
<table class="table table-bordered table-condensed table-striped" cellspacing="0" id="gv_report">
<thead>
<tr>
<th scope="col">Lot</th><th scope="col">Op</th><th scope="col">Status</th><th scope="col">iDispo Status</th><th scope="col">Dispo By</th><th scope="col">Dispo Date</th><th scope="col">T.R Count</th><th scope="col">View</th>
</tr>
</thead><tbody>
<tr>
<td>7649B703</td><td>6262</td><td>FAIL</td><td>FAIL</td><td>mly2</td><td>12/10/2016 4:30:47 PM</td><td>1</td><td>
<a href='/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B703&Location=6262'
target="_blank"><i class="icon-eye-open"></i></a>
</td>
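Answer 0 (score: 0)
You can try Cinchoo ETL, an open-source file helper library for parsing, transforming, and writing files in different formats.
In your case, you want to convert an HTML table to a CSV file. The sample code below shows how to do it.
Take this sample HTML table: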
<table id="example1" border="1" style="background-color:#FFFFCC" width="0%" cellpadding="3" cellspacing="3">
<tr>
<th>Title</th>
<th>Name</th>
<th>Phone</th>
</tr>
<tr>
<td>Mr.</td>
<td>John</td>
<td>07868785831</td>
</tr>
<tr>
<td>Miss</td>
<td>Linda</td>
<td>0141-2244-5566</td>
</tr>
<tr>
<td>Master</td>
<td>Jack</td>
<td>0142-1212-1234</td>
</tr>
<tr>
<td>Mr.</td>
<td>Bush</td>
<td>911-911-911</td>
</tr>
</table>
The code below extracts the Title, Name, and Phone column values and writes them to a CSV file:
using (var cr = new ChoCSVWriter("HtmlTable.csv").WithFirstLineHeader())
{
    using (var xr = new ChoXmlReader("HTMLTable.xml").WithXPath("/table/tr")
        .WithField("Title", xPath: "td[1]", fieldType: typeof(string))
        .WithField("Name", xPath: "td[2]", fieldType: typeof(string))
        .WithField("Phone", xPath: "td[3]", fieldType: typeof(string))
    )
    {
        cr.Write(xr.Where(r => !((string)r.Title).IsNullOrWhiteSpace()));
    }
}
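The code above opens a ChoXmlReader with a column specification and an XPath expression for each field, then passes the reader to the ChoCSVWriter to create the CSV file. The Where clause skips the header row, which contains only th cells and therefore yields an empty Title.
Here is the CSV output for the HTML table: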
Title,Name,Phone
Mr.,John,07868785831
Miss,Linda,0141-2244-5566
Master,Jack,0142-1212-1234
Mr.,Bush,911-911-911
Hope this helps.
Update
Here is your XML with a few corrections:
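For comparison, the same row-by-row idea can be sketched with only the .NET base class library (System.Xml.Linq) instead of Cinchoo ETL. This is a minimal sketch, not the library's method: the names HtmlTableCsv, ToCsv, and Escape are hypothetical, and it assumes the table markup is well-formed XML without namespaces, like the sample above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of Cinchoo ETL): converts a well-formed table to CSV text.
public static class HtmlTableCsv
{
    // Quote a field only when it contains a comma, a double quote, or a line break.
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Emit one CSV line per <tr>, reading its <th> and <td> cells in document order.
    public static string ToCsv(string tableXml)
    {
        XDocument doc = XDocument.Parse(tableXml);
        IEnumerable<string> lines = doc.Descendants("tr")
            .Select(tr => string.Join(",",
                tr.Elements()
                  .Where(c => c.Name.LocalName == "th" || c.Name.LocalName == "td")
                  .Select(c => Escape(c.Value.Trim()))));
        return string.Join(Environment.NewLine, lines);
    }
}
```

Calling HtmlTableCsv.ToCsv with the contents of the sample table returns the header line followed by one line per data row; File.WriteAllText can then save the result as HtmlTable.csv. Because the header row is read from the th cells, no separate header configuration is needed in this sketch.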
<div class="section_content">
<div>
<table class="table table-bordered table-condensed table-striped" cellspacing="0" id="gv_report">
<thead>
<tr>
<th scope="col">Lot</th>
<th scope="col">Op</th>
<th scope="col">Status</th>
<th scope="col">iDispo Status</th>
<th scope="col">Dispo By</th>
<th scope="col">Dispo Date</th>
<th scope="col">T.R Count</th>
<th scope="col">View</th>
</tr>
</thead>
<tbody>
<tr>
<td>7649B703</td>
<td>6262</td>
<td>FAIL</td>
<td>FAIL</td>
<td>mly2</td>
<td>12/10/2016 4:30:47 PM</td>
<td>1</td>
<td>
<a href='/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B703&Location=6262' target="_blank">
<i class="icon-eye-open"></i>
</a>
</td>
</tr>
<tr>
<td>7649B703</td>
<td>6262</td>
<td>FAIL</td>
<td>FAIL</td>
<td>mly2</td>
<td>12/10/2016 4:30:47 PM</td>
<td>1</td>
<td>
<a href='/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B703&Location=6262' target="_blank">
<i class="icon-eye-open"></i>
</a>
</td>
</tr>
</tbody>
</table>
</div>
</div>
Here is the code to convert it to a CSV file:
using (var cr = new ChoCSVWriter("HtmlTable.csv").WithFirstLineHeader())
{
    // Lot values such as 7649B703 are alphanumeric, so the field is read as a string.
    using (var xr = new ChoXmlReader("HTMLTable.xml").WithXPath("//tbody/tr")
        .WithField("Lot", xPath: "td[1]", fieldType: typeof(string))
        .WithField("Op", xPath: "td[2]", fieldType: typeof(int))
        .WithField("Status", xPath: "td[3]", fieldType: typeof(string))
        .WithField("iDispoStatus", xPath: "td[4]", fieldType: typeof(string))
        .WithField("DispoBy", xPath: "td[5]", fieldType: typeof(string))
        .WithField("DispoDate", xPath: "td[6]", fieldType: typeof(DateTime))
        .WithField("TRCount", xPath: "td[7]", fieldType: typeof(int))
        .WithField("View", xPath: "td[8]/a/@href", fieldType: typeof(string))
    )
    {
        cr.Write(xr);
    }
}
CSV output as originally posted. It was generated with Lot declared as typeof(int), which is why the Lot column shows 0; with typeof(string) the column should contain 7649B703 instead:
Lot,Op,Status,iDispoStatus,DispoBy,DispoDate,TRCount,View
0,6262,FAIL,FAIL,mly2,12/10/2016,1,/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B70
0,6262,FAIL,FAIL,mly2,12/10/2016,1,/SS_PROD/Report/LotDispoHistSummRepPopUp.aspx?Lot=7649B703
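The report table adds two wrinkles: the header lives in thead, and the last column holds a link rather than text. The sketch below handles both with System.Xml.Linq only. It is an illustration under stated assumptions, not Cinchoo ETL code: ReportTableCsv, ToCsv, CellText, and Escape are hypothetical names, the input must be well-formed XML, and every cell is kept as text so an alphanumeric Lot survives unchanged. Note that XDocument.Parse rejects a bare & inside an attribute, so the href must be written with &amp; in the input.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: header from <thead>, rows from <tbody>, links reduced to their href.
public static class ReportTableCsv
{
    // Quote a field only when it contains a comma, a double quote, or a line break.
    static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // A cell that directly wraps an <a> yields the link's href; any other cell yields its text.
    static string CellText(XElement td)
    {
        var href = td.Element("a")?.Attribute("href")?.Value;
        return Escape(href ?? td.Value.Trim());
    }

    public static string ToCsv(string xml)
    {
        // Use the first <table> found anywhere in the document (it may be nested in <div>s).
        XElement table = XDocument.Parse(xml).Descendants("table").First();
        string header = string.Join(",", table.Descendants("th").Select(th => Escape(th.Value.Trim())));
        var rows = table.Descendants("tbody").Elements("tr")
            .Select(tr => string.Join(",", tr.Elements("td").Select(td => CellText(td))));
        return string.Join(Environment.NewLine, new[] { header }.Concat(rows));
    }
}
```

Keeping every column as text sidesteps type-conversion surprises; values can still be parsed into int or DateTime later if the CSV consumer needs typed data.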