Here is a fragment of the XML file I am working with:
<page>
<title>AccessibleComputing</title>
<ns>0</ns>
<id>10</id>
<redirect title="Computer accessibility" />
<revision>
<id>381202555</id>
<parentid>381200179</parentid>
<timestamp>2010-08-26T22:38:36Z</timestamp>
<contributor>
<username>OlEnglish</username>
<id>7181920</id>
</contributor>
<minor />
<comment>[[Help:Reverting|Reverted]] edits by [[Special:Contributions/76.28.186.133|76.28.186.133]] ([[User talk:76.28.186.133|talk]]) to last version by Gurch</comment>
<text xml:space="preserve">#REDIRECT [[Computer accessibility]] {{R from CamelCase}}</text>
<sha1>lo15ponaybcg2sf49sstw9gdjmdetnk</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
<page>
<title>AfghanistanGeography</title>
<ns>0</ns>
<id>14</id>
<redirect title="Geography of Afghanistan" />
<revision>
<id>407008307</id>
<parentid>74466619</parentid>
<timestamp>2011-01-10T03:56:19Z</timestamp>
<contributor>
<username>Graham87</username>
<id>194203</id>
</contributor>
<minor />
<comment>1 revision from [[:nost:AfghanistanGeography]]: import old edit, see [[User:Graham87/Import]]</comment>
<text xml:space="preserve">#REDIRECT [[Geography of Afghanistan]] {{R from CamelCase}}</text>
<sha1>0uwuuhiam59ufbu0uzt9lookwtx9f4r</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
<page>
<title>AfghanistanPeople</title>
<ns>0</ns>
<id>15</id>
<redirect title="Demography of Afghanistan" />
<revision>
<id>135089040</id>
<parentid>74466558</parentid>
<timestamp>2007-06-01T13:59:37Z</timestamp>
<contributor>
<username>RussBot</username>
<id>279219</id>
</contributor>
<minor />
<comment>Robot: Fixing [[Special:DoubleRedirects|double-redirect]] -"Demographics of Afghanistan" +"Demography of Afghanistan"</comment>
<text xml:space="preserve">#REDIRECT [[Demography of Afghanistan]] {{R from CamelCase}}</text>
<sha1>744dgrl7ef5p53yffn2a989ly1dyr8f</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
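Now, given the value "AccessibleComputing", how can I retrieve the XMLInternalElementNode that corresponds to 'AccessibleComputing'? I tried getNodeSet, but without success.
Thanks.
Updated question
I should have shown the entire sample.xml file from the start. Here it is, followed by the problem I am facing: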
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.8/ http://www.mediawiki.org/xml/export-0.8.xsd" version="0.8" xml:lang="en">
<siteinfo>
<sitename>Wikipedia</sitename>
<base>http://en.wikipedia.org/wiki/Main_Page</base>
<generator>MediaWiki 1.21wmf8</generator>
<case>first-letter</case>
<namespaces>
<namespace key="-2" case="first-letter">Media</namespace>
<namespace key="-1" case="first-letter">Special</namespace>
<namespace key="0" case="first-letter" />
<namespace key="1" case="first-letter">Talk</namespace>
<namespace key="2" case="first-letter">User</namespace>
<namespace key="3" case="first-letter">User talk</namespace>
<namespace key="4" case="first-letter">Wikipedia</namespace>
<namespace key="5" case="first-letter">Wikipedia talk</namespace>
<namespace key="6" case="first-letter">File</namespace>
<namespace key="7" case="first-letter">File talk</namespace>
<namespace key="8" case="first-letter">MediaWiki</namespace>
<namespace key="9" case="first-letter">MediaWiki talk</namespace>
<namespace key="10" case="first-letter">Template</namespace>
<namespace key="11" case="first-letter">Template talk</namespace>
<namespace key="12" case="first-letter">Help</namespace>
<namespace key="13" case="first-letter">Help talk</namespace>
<namespace key="14" case="first-letter">Category</namespace>
<namespace key="15" case="first-letter">Category talk</namespace>
<namespace key="100" case="first-letter">Portal</namespace>
<namespace key="101" case="first-letter">Portal talk</namespace>
<namespace key="108" case="first-letter">Book</namespace>
<namespace key="109" case="first-letter">Book talk</namespace>
<namespace key="446" case="first-letter">Education Program</namespace>
<namespace key="447" case="first-letter">Education Program talk</namespace>
<namespace key="710" case="first-letter">TimedText</namespace>
<namespace key="711" case="first-letter">TimedText talk</namespace>
</namespaces>
</siteinfo>
<page>
<title>AccessibleComputing</title>
<ns>0</ns>
<id>10</id>
<redirect title="Computer accessibility" />
<revision>
<id>381202555</id>
<parentid>381200179</parentid>
<timestamp>2010-08-26T22:38:36Z</timestamp>
<contributor>
<username>OlEnglish</username>
<id>7181920</id>
</contributor>
<minor />
<comment>[[Help:Reverting|Reverted]] edits by [[Special:Contributions/76.28.186.133|76.28.186.133]] ([[User talk:76.28.186.133|talk]]) to last version by Gurch</comment>
<text xml:space="preserve">#REDIRECT [[Computer accessibility]] {{R from CamelCase}}</text>
<sha1>lo15ponaybcg2sf49sstw9gdjmdetnk</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
<page>
<title>History</title>
<ns>0</ns>
<id>13</id>
<redirect title="History of " />
<revision>
<id>74466652</id>
<parentid>15898948</parentid>
<timestamp>2006-09-08T04:15:52Z</timestamp>
<contributor>
<username>Rory096</username>
<id>750223</id>
</contributor>
<comment>cat rd</comment>
<text xml:space="preserve">#REDIRECT [[History of ]] {{R from CamelCase}}</text>
<sha1>d4tdz2eojqzamnuockahzcbrgd1t9oi</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
<page>
<title>Geography</title>
<ns>0</ns>
<id>14</id>
<redirect title="Geography of " />
<revision>
<id>407008307</id>
<parentid>74466619</parentid>
<timestamp>2011-01-10T03:56:19Z</timestamp>
<contributor>
<username>Graham87</username>
<id>194203</id>
</contributor>
<minor />
<comment>1 revision from [[:nost:Geography]]: import old edit, see [[User:Graham87/Import]]</comment>
<text xml:space="preserve">#REDIRECT [[Geography of ]] {{R from CamelCase}}</text>
<sha1>0uwuuhiam59ufbu0uzt9lookwtx9f4r</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
<page>
<title>People</title>
<ns>0</ns>
<id>15</id>
<redirect title="Demography of " />
<revision>
<id>135089040</id>
<parentid>74466558</parentid>
<timestamp>2007-06-01T13:59:37Z</timestamp>
<contributor>
<username>RussBot</username>
<id>279219</id>
</contributor>
<minor />
<comment>Robot: Fixing [[Special:DoubleRedirects|double-redirect]] -"Demographics of " +"Demography of "</comment>
<text xml:space="preserve">#REDIRECT [[Demography of ]] {{R from CamelCase}}</text>
<sha1>744dgrl7ef5p53yffn2a989ly1dyr8f</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
</mediawiki>
How do I get the page node whose title element has the value "AccessibleComputing"? I tried the following:
doc = xmlTreeParse('sample.xml',useInternalNodes=TRUE)
getNodeSet(doc, "//page[title=\"AccessibleComputing\"]")
It returned
list()
attr(,"class")
[1] "XMLNodeSet"
Expected output:
[[1]]
<page>
<title>AccessibleComputing</title>
<ns>0</ns>
<id>10</id>
<redirect title="Computer accessibility"/>
<revision>
<id>381202555</id>
<parentid>381200179</parentid>
<timestamp>2010-08-26T22:38:36Z</timestamp>
<contributor>
<username>OlEnglish</username>
<id>7181920</id>
</contributor>
<minor/>
<comment>[[Help:Reverting|Reverted]] edits by [[Special:Contributions/76.28.186.133|76.28.186.133]] ([[User talk:76.28.186.133|talk]]) to last version by Gurch</comment>
<text xml:space="preserve">#REDIRECT [[Computer accessibility]] {{R from CamelCase}} </text>
<sha1>lo15ponaybcg2sf49sstw9gdjmdetnk</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
attr(,"class")
[1] "XMLNodeSet"
I suspect my XPath query is incorrect; the presence of the 'siteinfo' node seems to break my attempts. Any suggestions?
Answer 0 (score: 2)
To parse your file, I added a new tag
<pages>
....
</pages>
Then, using xpathSApply, I can retrieve all the title elements:
library(XML)
doc = xmlTreeParse('c:/temp/testxml.xml',useInternalNodes=TRUE)
xpathSApply(doc,'//page/title',xmlValue)
"AccessibleComputing" "AfghanistanGeography" "AfghanistanPeople"
You can also use getNodeSet:
getNodeSet(doc,'//page/title')
[[1]]
<title>AccessibleComputing</title>
[[2]]
<title>AfghanistanGeography</title>
[[3]]
<title>AfghanistanPeople</title>
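The wrapping step above can also be done in R rather than by editing the file. The following is a minimal sketch, assuming the rootless fragment is available as a string; the abbreviated two-page `fragment`, and the names `wrapped`, `titles`, `hits` and `page_id`, are made up for illustration and are not from the question.

```r
library(XML)

# Two <page> fragments with no common root (abbreviated from the question).
fragment <- '<page><title>AccessibleComputing</title><id>10</id></page>
<page><title>AfghanistanGeography</title><id>14</id></page>'

# Wrap the fragments in a single root element so the parser accepts them.
wrapped <- paste0("<pages>", fragment, "</pages>")
doc <- xmlParse(wrapped, asText = TRUE)

# All titles as a character vector.
titles <- xpathSApply(doc, "//page/title", xmlValue)

# The <page> node whose <title> child equals the given value.
hits <- getNodeSet(doc, "//page[title='AccessibleComputing']")

# Read a child of the matched node by name.
page_id <- xmlValue(hits[[1]][["id"]])
```

Here `hits[[1]]` is the internal node for the matching page, so its other children can be read from it directly, as `page_id` shows.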
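Answer 1 (score: 0)
If you want to get any page whose title value is AccessibleComputing, you should use getNodeSet(doc,'//page[title="AccessibleComputing"]').
If you want to get any node that has an immediate child node called title whose value is AccessibleComputing, you should use getNodeSet(doc,'//node()[title="AccessibleComputing"]').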
library(XML)
# Single-quoted so that the double quotes inside the XML need no escaping.
xml <- '<pages><page>
<title>AccessibleComputing</title>
<ns>0</ns>
<id>10</id>
<redirect title="Computer accessibility" />
<revision>
<id>381202555</id>
<parentid>381200179</parentid>
<timestamp>2010-08-26T22:38:36Z</timestamp>
<contributor>
<username>OlEnglish</username>
<id>7181920</id>
</contributor>
<minor />
<comment>[[Help:Reverting|Reverted]] edits by [[Special:Contributions/76.28.186.133|76.28.186.133]] ([[User talk:76.28.186.133|talk]]) to last version by Gurch</comment>
<text xml:space="preserve">#REDIRECT [[Computer accessibility]] {{R from CamelCase}}</text>
<sha1>lo15ponaybcg2sf49sstw9gdjmdetnk</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
<page>
<title>AfghanistanGeography</title>
<ns>0</ns>
<id>14</id>
<redirect title="Geography of Afghanistan" />
<revision>
<id>407008307</id>
<parentid>74466619</parentid>
<timestamp>2011-01-10T03:56:19Z</timestamp>
<contributor>
<username>Graham87</username>
<id>194203</id>
</contributor>
<minor />
<comment>1 revision from [[:nost:AfghanistanGeography]]: import old edit, see [[User:Graham87/Import]]</comment>
<text xml:space="preserve">#REDIRECT [[Geography of Afghanistan]] {{R from CamelCase}}</text>
<sha1>0uwuuhiam59ufbu0uzt9lookwtx9f4r</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page>
<page>
<title>AfghanistanPeople</title>
<ns>0</ns>
<id>15</id>
<redirect title="Demography of Afghanistan" />
<revision>
<id>135089040</id>
<parentid>74466558</parentid>
<timestamp>2007-06-01T13:59:37Z</timestamp>
<contributor>
<username>RussBot</username>
<id>279219</id>
</contributor>
<minor />
<comment>Robot: Fixing [[Special:DoubleRedirects|double-redirect]] -"Demographics of Afghanistan" +"Demography of Afghanistan"</comment>
<text xml:space="preserve">#REDIRECT [[Demography of Afghanistan]] {{R from CamelCase}}</text>
<sha1>744dgrl7ef5p53yffn2a989ly1dyr8f</sha1>
<model>wikitext</model>
<format>text/x-wiki</format>
</revision>
</page></pages>'
doc = xmlTreeParse(xml, useInternalNodes = TRUE)
# Get the page that has an immediate child node called title whose
# value is 'AccessibleComputing'
getNodeSet(doc, "//page[title=\"AccessibleComputing\"]")
## [[1]]
## <page>
## <title>AccessibleComputing</title>
## <ns>0</ns>
## <id>10</id>
## <redirect title="Computer accessibility"/>
## <revision><id>381202555</id><parentid>381200179</parentid><timestamp>2010-08-26T22:38:36Z</timestamp><contributor><username>OlEnglish</username><id>7181920</id></contributor><minor/><comment>[[Help:Reverting|Reverted]] edits by [[Special:Contributions/76.28.186.133|76.28.186.133]] ([[User talk:76.28.186.133|talk]]) to last version by Gurch</comment><text xml:space="preserve">#REDIRECT [[Computer accessibility]] {{R from CamelCase}}</text>
## <sha1>lo15ponaybcg2sf49sstw9gdjmdetnk</sha1><model>wikitext</model><format>text/x-wiki</format></revision>
## </page>
##
## attr(,"class")
## [1] "XMLNodeSet"
# Get any node that has an immediate child node called title whose
# value is 'AccessibleComputing'
getNodeSet(doc, "//node()[title=\"AccessibleComputing\"]")
## [[1]]
## <page>
## <title>AccessibleComputing</title>
## <ns>0</ns>
## <id>10</id>
## <redirect title="Computer accessibility"/>
## <revision><id>381202555</id><parentid>381200179</parentid><timestamp>2010-08-26T22:38:36Z</timestamp><contributor><username>OlEnglish</username><id>7181920</id></contributor><minor/><comment>[[Help:Reverting|Reverted]] edits by [[Special:Contributions/76.28.186.133|76.28.186.133]] ([[User talk:76.28.186.133|talk]]) to last version by Gurch</comment><text xml:space="preserve">#REDIRECT [[Computer accessibility]] {{R from CamelCase}}</text>
## <sha1>lo15ponaybcg2sf49sstw9gdjmdetnk</sha1><model>wikitext</model><format>text/x-wiki</format></revision>
## </page>
##
## attr(,"class")
## [1] "XMLNodeSet"
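Note that the queries above are run against a document with no namespace. The full sample.xml in the question declares a default namespace on its mediawiki root element, and in XPath 1.0 an unprefixed name such as page only matches elements in no namespace, which is probably why the same query returned an empty node set there. A minimal sketch of two possible workarounds follows, assuming a cut-down copy of the file; the prefix "mw" and the variable names are arbitrary choices made for illustration.

```r
library(XML)

# Cut-down version of sample.xml that keeps the root element's namespace declarations.
xml <- '<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="0.8" xml:lang="en">
<siteinfo><sitename>Wikipedia</sitename></siteinfo>
<page><title>AccessibleComputing</title><id>10</id></page>
<page><title>History</title><id>13</id></page>
</mediawiki>'
doc <- xmlParse(xml, asText = TRUE)

# Unprefixed names do not match namespaced elements, so this finds nothing.
plain <- getNodeSet(doc, "//page[title='AccessibleComputing']")

# Option 1: bind a prefix of our choosing to the namespace URI and use it in the query.
ns <- c(mw = "http://www.mediawiki.org/xml/export-0.8/")
hits <- getNodeSet(doc, "//mw:page[mw:title='AccessibleComputing']",
                   namespaces = ns)

# Option 2: ignore namespaces altogether by comparing local names.
loose <- getNodeSet(doc,
  "//*[local-name()='page'][*[local-name()='title']='AccessibleComputing']")
```

The prefix form is usually easier to read, and the local-name() form avoids having to know the namespace URI. Under this reading, the siteinfo node is not what breaks the query.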