How do I get all content with a specified class from a string?

Time: 2016-08-23 23:47:40

Tags: c# xaml

I'm trying to make a Windows Phone app for my forum. This is the function I have so far (it just fetches the xaml):

private async void go_Click(object sender, RoutedEventArgs e)
    {
        HttpClient wc = new HttpClient();
        HttpResponseMessage response = await wc.GetAsync("http://www.myforum.com/");
        response.EnsureSuccessStatusCode();
        string xaml = await response.Content.ReadAsStringAsync();
        textXAML.Text = xaml;
    }

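There are no errors in my code, and I do get the xaml.

What I want to do is get all of the forum's category names. All of the category names have a class, "category name".

How can I get the category names? Can I extract them from the string? Do I have to parse the string, or something else?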

1 answer:

Answer 0: (score: 0)

I think this article on CodeProject will help you:

http://www.codeproject.com/Tips/804660/How-to-Parse-Html-using-csharp

It uses a library called HtmlAgilityPack, which you can install via NuGet:

http://www.nuget.org/packages/HtmlAgilityPack

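The following example may work for your purposes. I couldn't test it because of firewall issues, but hopefully it runs correctly and helps answer your question.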

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Text;
using HtmlAgilityPack;

namespace HtmlParserDemo
{
    class Program
    {
        // Update the URL to the page you are trying to parse
        private const string Url = "http://www.bing.com/";
        private const string TagName = "a";
        private const string ClassName = "forumtitle";

        static void Main(string[] args)
        {
            try
            {
                Console.WriteLine("Getting HTML from: {0}", Url);

                foreach (var category in GetCategories(Url, TagName, ClassName))
                {
                    Console.WriteLine(category);
                }
            }
            catch (Exception exception)
            {
                while (exception != null)
                {
                    Console.WriteLine(exception);
                    exception = exception.InnerException;
                }
            }
            finally
            {
                Console.WriteLine("Press any key to exit...");
                Console.ReadKey(true);
            }
        }

        public static IEnumerable<string> GetCategories(string url, string htmlTag, string className = "")
        {
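            // Download the raw bytes and decode all of them as UTF-8.
            // Note: .Result blocks the calling thread; that is fine in a console app,
            // but in a UI event handler use "await" instead.
            var response = new HttpClient().GetByteArrayAsync(url).Result;
            string source = Encoding.UTF8.GetString(response, 0, response.Length);

            // Parse the markup as-is; decoding entities before parsing could turn
            // escaped text such as &lt;a&gt; into real tags.
            var result = new HtmlDocument();
            result.LoadHtml(source);

            return result.DocumentNode.Descendants()
                .Where(node => node.Name == htmlTag &&
                               (string.IsNullOrEmpty(className) ||
                                (node.Attributes["class"] != null &&
                                 node.Attributes["class"].Value == className)))
                // Decode HTML entities (e.g. &amp;) in the text of each matching node.
                .Select(node => WebUtility.HtmlDecode(node.InnerText));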
        }
    }
}
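
Since the question already has the page in a string, here is a rough sketch of the same idea that skips the download and parses that string directly. The class CategoryParser and the method ExtractByClass are hypothetical names made up for illustration, and "forumtitle" is only the class name from the example above, so check your forum's actual markup. Unlike the exact comparison above, this version also matches elements that carry several classes (for example class="forumtitle big").

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using HtmlAgilityPack;

public static class CategoryParser
{
    // Hypothetical helper: returns the decoded, trimmed text of every <htmlTag>
    // element whose class attribute contains className as one of its tokens.
    public static List<string> ExtractByClass(string html, string htmlTag, string className)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        return document.DocumentNode
            .Descendants(htmlTag)
            // A class attribute is a whitespace-separated list, so compare token by token.
            .Where(node => node.GetAttributeValue("class", string.Empty)
                .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
                .Contains(className))
            // InnerText still contains entities such as &amp;, so decode them here.
            .Select(node => WebUtility.HtmlDecode(node.InnerText).Trim())
            .ToList();
    }
}
```

In the click handler from the question, you could then call it on the downloaded string, something like: var names = CategoryParser.ExtractByClass(xaml, "a", "forumtitle"); and bind or display the resulting list instead of the raw text.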