mshtml.HTMLTableCell has an incorrect innerText value

Asked: 2014-03-04 13:23:15

Tags: c# html mshtml innertext

When I try to parse an HTMLTableCell, the innerText value is incorrect: I seem to get the class name instead of the text.

The strange thing is that when I inspect the cell in the debugger (in VS2010), I see the correct value. What am I doing wrong?

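Further investigation turned up the following. When I look at the values in VS2010, cell.innerText is "center time", while ((mshtml.HTMLTableCellClass)(cell)).innerText is "23:45". The problem is that the code does not compile when I cast to mshtml.HTMLTableCellClass, so I have to use the interface (why is that?).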

See the code below:

mshtml.HTMLDocument doc = MainBrowser.Document as mshtml.HTMLDocument;

if (doc != null)
{

    mshtml.HTMLTable table = doc.getElementById("ecEventsTable") as mshtml.HTMLTable;

    List<List<string>> textRows = new List<List<string>>();

    foreach (mshtml.HTMLTableRow row in table.rows)
    {
        if (row != null && row.id != null && row.id.Contains("eventRowId"))
        {
            List<string> temp = new List<string>();

            foreach (mshtml.HTMLTableCell cell in row.cells)
            {
                string text = cell.innerText;
                if (text != null && text != "" && text != " ")
                {
                    if (text.Contains("\r\n"))
                        text = text.Replace("\r\n", "");

                    temp.Add(cell.innerText);
                }
            }

            if (temp.Count > 0)
                textRows.Add(temp);
        }
    }

    foreach (var row in textRows)
    {
        string str = String.Join(" ", row);
    }
}


Sample HTML row:

<tr id="eventRowId_34599" onclick="javascript:changeEventDisplay(34599, this, 'overview');" event_timestamp="2014-02-24 01:30:00" event_attr_id="752"> <td class="center time">01:30</td> <td class="flagCur"><span title="China" class=" ceFlags China">&nbsp;</span>CNY</td> <td title="" class="sentiment"><i class="newSiteIconsSprite grayFullBullishIcon middle"></i><i class="newSiteIconsSprite grayEmptyBullishIcon middle"></i> <i class="newSiteIconsSprite grayEmptyBullishIcon middle"></i></td>
<td class="left event">China House Prices (YoY)</td> <td title="" class="bold act blackFont" id="eventActual_34599">9.6%</td> <td class="fore" id="eventForecast_34599">&nbsp;</td> <td class="prev blackFont" id="eventPrevious_34599">9.9%</td> <td class="diamond" id="eventRevisedFrom_34599">&nbsp;</td> </tr>

1 Answer:

Answer 0 (score: 3)

Instead of mshtml.HTMLTableCell, I used mshtml.IHTMLElement as the loop variable type, and now it works.

Fixed code (see the old version in the question):

foreach (mshtml.IHTMLElement cell in row.cells)
{
    string text = cell.innerText;

    if (text != null && text != "" && text != " ")
    {
        if (text.Contains("\r\n"))
            text = text.Replace("\r\n", "");

        temp.Add(text);
    }
}
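
The filtering step in the loop above can also be pulled out into a small helper that does not depend on mshtml. This is only a sketch under a few assumptions: the names CellText, Normalize, and CollectRow are hypothetical and not part of the original code, and the explicit comparisons against "" and " " are swapped for string.IsNullOrWhiteSpace (available from .NET 4). That swap changes behaviour slightly, because it also skips cells that contain only a non-breaking space, such as the &nbsp; cells in the sample row, and the extra Trim() removes leading and trailing whitespace.

```csharp
using System;
using System.Collections.Generic;

static class CellText
{
    // Hypothetical helper: returns the cleaned text,
    // or null when the cell is effectively empty.
    public static string Normalize(string raw)
    {
        // Treats null, "", " ", "\r\n" and "\u00A0" (&nbsp;) as empty.
        if (string.IsNullOrWhiteSpace(raw))
            return null;

        // Drop embedded line breaks, then trim surrounding whitespace.
        return raw.Replace("\r\n", "").Trim();
    }

    // Hypothetical helper: keeps only the non-empty, cleaned cell texts.
    public static List<string> CollectRow(IEnumerable<string> rawCells)
    {
        var result = new List<string>();
        foreach (string raw in rawCells)
        {
            string text = Normalize(raw);
            if (text != null)
                result.Add(text);
        }
        return result;
    }
}

static class Program
{
    static void Main()
    {
        // Stand-in values for what cell.innerText might return for one row.
        var row = CellText.CollectRow(
            new[] { "01:30", "\u00A0", "China House Prices (YoY)\r\n", null, "9.6%" });

        Console.WriteLine(string.Join(" | ", row));
    }
}
```

Inside the mshtml loop, the body would then reduce to calling CellText.Normalize(cell.innerText) and adding the result to temp when it is not null.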