Question

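I want to download the content of a Wikipedia API page and show its text in a text box, with the unwanted characters in the response stripped out.

Example: link to the page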

http://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&explaintext=1&titles=stack%20overflow

I want all the unwanted parts, such as

{"query":{"normalized":[{"from":"stack overflow","to":"Stack overflow"}],"pages":{"1436888":{"pageid":1436888,"ns":0,"title":"Stack overflow","extract":

to be removed.

I have already tried this, but it does not work for this page:

textbox1.Text = XDocument.Parse(new Regex("[[(.*?]]").Matches(new WebClient().DownloadString("http://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&explaintext=1&titles=stack%20overflow")[0].Value).Root.Value;

Answer 1

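The best way to get the text is to deserialize the JSON response. Below is an example using the Newtonsoft JSON parser. Since you are not interested in the full response, you only need to deserialize the parts required to reach the node named "extract".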

using System;
using System.Collections.Generic;
using System.Net;
using System.IO;
using System.Linq;
using Newtonsoft.Json;

public class Program
{
    public static void Main()
    {
        WebClient client = new WebClient();

        using (Stream stream = client.OpenRead("http://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&explaintext=1&titles=stack%20overflow"))
        using (StreamReader reader = new StreamReader(stream))
        {
            JsonSerializer ser = new JsonSerializer();
            Result result = ser.Deserialize<Result>(new JsonTextReader(reader));

            foreach(Page page in result.query.pages.Values)
                Console.WriteLine(page.extract);
        }
    }
}

public class Result
{
    public Query query { get; set; }
}

public class Query
{
    public Dictionary<string, Page> pages { get; set; }
}

public class Page
{   
    public string extract { get; set; }
}

Here is a fiddle with the working code: https://dotnetfiddle.net/xGv7lG

Answer 2

What you are getting back is JSON, not XDocument / XML.

You can generate C# classes for the JSON here: http://json2csharp.com/

Download the Newtonsoft JSON NuGet package.

Then you can do

var deserialized = JsonConvert.DeserializeObject<YourJsonRootClass>(rawJsonString);

After that, you can access the page content through the "deserialized" object.
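As a rough sketch of what that could look like end to end: the class names WikiResponse, WikiQuery and WikiPage below are hypothetical stand-ins for YourJsonRootClass (they are not generated by json2csharp), and the JSON is a shortened, hard-coded sample rather than a live response. The pages node is mapped to a dictionary on the assumption that its keys are page ids that change from page to page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical classes standing in for "YourJsonRootClass".
// Property names match the JSON keys so no attributes are needed.
public class WikiResponse { public WikiQuery query { get; set; } }
public class WikiQuery { public Dictionary<string, WikiPage> pages { get; set; } }
public class WikiPage
{
    public string title { get; set; }
    public string extract { get; set; }
}

public static class DeserializeDemo
{
    // Joins the "extract" text of every page in the response.
    public static string ExtractText(string rawJsonString)
    {
        var deserialized = JsonConvert.DeserializeObject<WikiResponse>(rawJsonString);
        return string.Join(Environment.NewLine,
            deserialized.query.pages.Values.Select(p => p.extract));
    }

    public static void Main()
    {
        // Shortened sample response (assumption: same shape as the real one).
        string sample = "{\"query\":{\"pages\":{\"1436888\":{\"pageid\":1436888,"
            + "\"ns\":0,\"title\":\"Stack overflow\",\"extract\":\"sample text\"}}}}";
        Console.WriteLine(ExtractText(sample));
    }
}
```

In a WinForms application the returned string could be assigned directly, for example textbox1.Text = DeserializeDemo.ExtractText(rawJsonString);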

Alternatively, you can parse the JSON dynamically, but I won't explain that here.
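For reference, one possible way to do the dynamic parsing is with JObject from Newtonsoft.Json.Linq, which needs no classes at all. This is only a sketch and not necessarily what the answer had in mind: the helper name GetExtracts is made up for illustration, and the JSON is a shortened, hard-coded sample instead of a downloaded response.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class DynamicParseDemo
{
    // Hypothetical helper: returns the "extract" text of every page,
    // or an empty array when the response has no query.pages node.
    public static string[] GetExtracts(string rawJson)
    {
        JObject root = JObject.Parse(rawJson);
        JObject pages = root["query"]?["pages"] as JObject;
        if (pages == null)
            return new string[0];

        // Each property under "pages" is keyed by a page id.
        return pages.Properties()
            .Select(p => (string)p.Value["extract"])
            .Where(s => s != null)
            .ToArray();
    }

    public static void Main()
    {
        // Shortened sample response (assumption: same shape as the real one).
        string sample = "{\"query\":{\"pages\":{\"1436888\":{\"pageid\":1436888,"
            + "\"ns\":0,\"title\":\"Stack overflow\",\"extract\":\"sample text\"}}}}";

        foreach (string extract in GetExtracts(sample))
            Console.WriteLine(extract);
    }
}
```

The trade-off is that typos in key names only show up at run time, whereas the class-based approach documents the expected shape in code.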

How to get Wikipedia content into a text box in C#
