How to scrape information from a table cell (td) in a WebBrowser control into textBox1?

Date: 2015-09-01 22:51:17

Tags: c# regex winforms split htmlelements

I want to scrape information from a website that lists product file names and profile serial numbers.

How can I scrape the product serial number when a new serial number appears each time? The HTML looks like this:

<pre> <td><b>product file number </b> 7269</td> </pre> 
<pre> <td><b>product file number </b> 7562</td> </pre> 
<pre> <td><b>product file number </b> 7502</td> </pre>

I am new to Windows Forms applications, so please provide full code if possible. I would really appreciate your help.

1 answer:

Answer 0 (score: 0)

You can treat the data as XML:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication45
{
    class Program
    {
        static void Main(string[] args)
        {
            string input =
               "<pre> <td><b>product file number </b> 7269</td>  </pre>" +
               "<pre> <td><b>product file number </b> 7562</td> </pre>" +
               "<pre> <td><b>product file number </b> 7502</td> </pre>";

            //add root tag around data so you have only one root tag
            input = string.Format("<Root>{0}</Root>", input);

            XElement root = XElement.Parse(input);
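            // Each <pre> holds one <td>: a <b> label followed by a text node containing the number.
            var products = root.Descendants("pre").Select(x => new {
                name = x.Descendants("b").First().Value.Trim(),
                number = int.Parse(x.Element("td").Nodes().OfType<XText>().First().Value.Trim())
            }).ToList();

            // Print the results; in a WinForms app the numbers could be assigned to a TextBox instead.
            foreach (var product in products)
            {
                Console.WriteLine("{0}: {1}", product.name, product.number);
            }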
        }

    }

}
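
The XML approach above requires well-formed input, and pages loaded in a WebBrowser control often are not well-formed XML. Since the question is tagged regex, here is a minimal sketch of a regex-based alternative. The names ProductNumberExtractor and ExtractNumbers are hypothetical, and the sketch assumes the label text is always "product file number" followed directly by the digits.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class ProductNumberExtractor
{
    // Matches the <b> label and captures the digits that follow it.
    // IgnoreCase is used because the markup's casing is not guaranteed.
    private static readonly Regex NumberPattern = new Regex(
        @"<b>\s*product file number\s*</b>\s*(\d+)",
        RegexOptions.IgnoreCase);

    public static List<int> ExtractNumbers(string html)
    {
        var numbers = new List<int>();
        foreach (Match match in NumberPattern.Matches(html))
        {
            numbers.Add(int.Parse(match.Groups[1].Value));
        }
        return numbers;
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample markup taken from the question.
        string html =
            "<pre> <td><b>product file number </b> 7269</td> </pre>" +
            "<pre> <td><b>product file number </b> 7562</td> </pre>" +
            "<pre> <td><b>product file number </b> 7502</td> </pre>";

        List<int> numbers = ProductNumberExtractor.ExtractNumbers(html);

        // One number per line; prints 7269, 7562 and 7502.
        Console.WriteLine(string.Join(Environment.NewLine, numbers));
    }
}
```

In a Windows Forms application, the same helper could probably be called from the WebBrowser control's DocumentCompleted event handler, passing webBrowser1.DocumentText as the input and assigning the joined string to textBox1.Text. The control names webBrowser1 and textBox1 are assumed designer defaults, and textBox1 would need Multiline set to true to show one number per line.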