Fetching data from Wikipedia using Special:Export

Time: 2012-12-08 13:04:44

Tags: asp.net xml vb.net mediawiki wikipedia-api

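I am trying to fetch data from Wikipedia using Special:Export.

Below is my code. I can't understand why it never enters the While loop, and I am not getting any errors. Please help.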

Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load

        Dim webRequest As System.Net.HttpWebRequest = CType(System.Net.WebRequest.Create("http://en.wikipedia.org/wiki/Special:Export/Train"), HttpWebRequest)
        webRequest.Credentials = System.Net.CredentialCache.DefaultCredentials
        webRequest.Accept = "text/xml"
        webRequest.UserAgent = "foo/bar"
        Dim webResponse As System.Net.HttpWebResponse = CType(webRequest.GetResponse, HttpWebResponse)
        Dim responseStream As System.IO.Stream = webResponse.GetResponseStream
        Dim reader As System.Xml.XmlTextReader = New XmlTextReader(responseStream)
        Dim NS As String = "http://www.mediawiki.org/xml/export-0.4/"
        Dim doc As XPathDocument = New XPathDocument(reader)
        reader.Close()
        webResponse.Close()
        Dim myXPathNavigator As XPathNavigator = doc.CreateNavigator
        Dim nodesText As XPathNodeIterator = myXPathNavigator.SelectDescendants("text", NS, False)

        While nodesText.MoveNext
            Response.Write((nodesText.Current.InnerXml + " "))
        End While
    End Sub

1 answer:

Answer 0 (score: 0):

Special:Export should be requested with POST. See the Special:Export manual.
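A minimal sketch of what such a POST might look like in VB.NET. It assumes the form field names `pages` and `curonly` and the `index.php?title=Special:Export` endpoint as I understand them from the Special:Export manual, so check them against the manual before relying on them. The module name, the helper `BuildExportForm`, and the User-Agent string are hypothetical.

```vb
Imports System
Imports System.Collections.Specialized
Imports System.Net
Imports System.Text

Module SpecialExportPostSketch

    ' Builds the form fields for the export request.
    ' Assumed field names: "pages" (page title) and "curonly" (current revision only).
    Function BuildExportForm(ByVal title As String) As NameValueCollection
        Dim form As New NameValueCollection()
        form.Add("pages", title)
        form.Add("curonly", "1")
        Return form
    End Function

    Sub Main()
        Using client As New WebClient()
            ' Hypothetical User-Agent; Wikimedia asks clients to identify themselves.
            client.Headers.Add("User-Agent", "WikiExportSketch/0.1 (contact@example.org)")

            ' UploadValues sends the fields as application/x-www-form-urlencoded.
            Dim body As Byte() = client.UploadValues(
                "https://en.wikipedia.org/w/index.php?title=Special:Export",
                "POST",
                BuildExportForm("Train"))

            ' The export dump is UTF-8 XML.
            Console.WriteLine(Encoding.UTF8.GetString(body))
        End Using
    End Sub

End Module
```

The response is the same export XML as before, so it still has to be parsed with the namespace that the dump actually declares.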

However, you really shouldn't do it this way. Use a web API client library for your language of choice and access the export module instead.
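As a rough sketch of the API route without a client library, the following calls `api.php` directly, using the `export` and `exportnowrap` parameters of the query module. The module name, the helpers `BuildExportUrl` and `ExtractTexts`, and the User-Agent string are hypothetical. Matching `<text>` elements by local name is a choice made here so that the code does not depend on a particular `export-0.x` namespace version; it is not something the answer prescribes.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Net
Imports System.Text
Imports System.Xml.Linq

Module WikiApiExportSketch

    ' Builds an api.php URL that asks the query module for a raw export dump.
    Function BuildExportUrl(ByVal title As String) As String
        Return "https://en.wikipedia.org/w/api.php?action=query&export=1&exportnowrap=1&titles=" &
               Uri.EscapeDataString(title)
    End Function

    ' Returns the content of every <text> element, ignoring the XML namespace,
    ' so the result does not depend on the export schema version.
    Function ExtractTexts(ByVal exportXml As String) As List(Of String)
        Dim doc As XDocument = XDocument.Parse(exportXml)
        Return doc.Descendants().
                   Where(Function(el) el.Name.LocalName = "text").
                   Select(Function(el) el.Value).
                   ToList()
    End Function

    Sub Main()
        Using client As New WebClient()
            client.Encoding = Encoding.UTF8
            ' Hypothetical User-Agent; Wikimedia asks clients to identify themselves.
            client.Headers.Add("User-Agent", "WikiExportSketch/0.1 (contact@example.org)")

            Dim xml As String = client.DownloadString(BuildExportUrl("Train"))
            For Each wikitext As String In ExtractTexts(xml)
                Console.WriteLine(wikitext)
            Next
        End Using
    End Sub

End Module
```

In an ASP.NET page, the same two helpers could be called from `Page_Load`, writing each string with `Response.Write` in place of `Console.WriteLine`.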