HtmlAgilityPack: removing whitespace

Date: 2014-03-11 11:40:51

Tags: c# html html-agility-pack

I am processing an HTML file, but the source contains a lot of whitespace.

                    <span itemscope itemtype="Company">







            <h3 class="headingLink">
                COMPANY


                    <span itemprop="name"><br/>Kroger</span>

            </h3>


              <span itemprop="streetAddress">607 Moores Mill Rd</span><br/>
              <span itemprop="addressLocality">Huntsville</span>
              <span itemprop="addressRegion">AL</span> <span itemprop="postalCode">35811</span>
              <br/><br/>

             </span> <!-- end itemscope company -->

                    <span itemscope itemtype="company">








            <h3 class="headingLink">
                COMPANY2

                    <span itemprop="name"><br/>North Militia</span>

            </h3>


              <span itemprop="streetAddress">225 N. Militia Dr</span><br/>
              <span itemprop="addressLocality">Lawrenceburg</span>
              <span itemprop="addressRegion">TN</span> <span itemprop="postalCode">38464</span>
              <br/><br/>

             </span> <!-- end itemscope company -->




                    <span itemprop="telephone">981-112-5521<br/></span>
                    <a rel="nofollow" id="aboutThisPhoneNumber9" href="javascript:void(0);" 

onclick="changeDisplay('/locator/locator/QuickHelp.do?

startsWithFilterProperty=about.this.phone.number&amp;openedItemProperty=about.this.phone.number9', 'quickHelpBox', 

'clickQuickHelp', event);return true;" class="dottedLink">About this phone number<span class="ada_hidden">-981-112-

5521                         

                <strong>Company Hours</strong>
                <br/>

Lobby:

    <!-- Set request scoped variable expected by daysOfWeekRange.jsp -->













































    Mon-Thu





9-4,


    <!-- Set request scoped variable expected by daysOfWeekRange.jsp -->





























   Fri




9-5,


    <!-- Set request scoped variable expected by daysOfWeekRange.jsp -->













































    Sat-Sun




















            COMPANY open 24 hours




        <br/>(visitors need to call in)

I want to store the name, address, phone number, and company hours in a table. So I tried:

HtmlDocument hdoc = new HtmlDocument();
hdoc.OptionWriteEmptyNodes = true;
hdoc.LoadHtml(searchResult);
HtmlNodeCollection divCon = hdoc.DocumentNode.SelectNodes("//div[@class=\"resultInfo\"]");
string[] ct = new string[divCon.Count];
for (int m = 0; m < divCon.Count; m++)
{
    if (divCon[m].InnerText.Trim().Contains("Company Hours"))
    {
        // Strip newlines, the JSP comment, and doubled spaces, then locate the hours section.
        string text = divCon[m].InnerText.Trim().Replace("\n", " ").Replace("<!-- Set request scoped variable expected by daysOfWeekRange.jsp -->", "").Replace("  ", " ").Replace("</br>", "");
        int ctr = text.IndexOf("Branch Hours");

        // Take at most 499 characters so Substring cannot run past the end of the text.
        if (ctr > 0)
            ct[m] = text.Substring(ctr, Math.Min(499, text.Length - ctr));
    }
}

I still get the empty space. How can I remove it?

THX Rashmi
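
A likely reason the attempt above leaves gaps: Replace("  ", " ") makes a single pass, so a run of many spaces is only roughly halved, and tabs or "\r" characters are not touched at all. The following is a minimal sketch of one way around that, assuming the text has already been read from InnerText. WhitespaceCleaner and Normalize are hypothetical names invented for this illustration; they are not part of HtmlAgilityPack. The sketch uses only the standard .NET Regex class.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class WhitespaceCleaner
{
    public static string Normalize(string innerText)
    {
        if (string.IsNullOrEmpty(innerText))
            return string.Empty;

        // Drop any HTML comments that leaked into the text.
        // Singleline lets "." match newlines inside a comment.
        string withoutComments = Regex.Replace(
            innerText, "<!--.*?-->", " ", RegexOptions.Singleline);

        // Collapse every run of whitespace (spaces, tabs, \r, \n)
        // into a single space, then trim both ends.
        return Regex.Replace(withoutComments, @"\s+", " ").Trim();
    }

    public static void Main()
    {
        string sample =
            "Mon-Thu\r\n\r\n\t   9-4,\n\n " +
            "<!-- Set request scoped variable expected by daysOfWeekRange.jsp -->" +
            "\n\n   Fri\n\n9-5,";

        // Prints: Mon-Thu 9-4, Fri 9-5,
        Console.WriteLine(Normalize(sample));
    }
}
```

In the loop from the question, this would presumably be applied as WhitespaceCleaner.Normalize(divCon[m].InnerText) before calling IndexOf, replacing the chained Replace calls.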

0 answers:

No answers