I am trying to use XPath to get the DataTable headers.
My output should be:
ItemNum | Item | ResultCode | Status | ExtBackLinks | RefDomains | AnalysisResUnitsCost | ACRank | ItemType | IndexedURLs | GetTopBackLinksAnalysisResUnitsCost | DownloadBacklinksAnalysisResUnitsCost | DownloadRefDomainBacklinksAnalysisResUnitsCost | RefIPs | RefSubNets | RefDomainsEDU | ExtBackLinksEDU | RefDomainsGOV | ExtBackLinksGOV | RefDomainsEDU_Exact | ExtBackLinksEDU_Exact | RefDomainsGOV_Exact | ExtBackLinksGOV_Exact | CrawledFlag | LastCrawlDate | LastCrawlResult | RedirectFlag | FinalRedirectResult | OutDomainsExternal | OutLinksExternal | OutLinksInternal | OutLinksPages | LastSeen | Title | RedirectTo | Language | LanguageDesc | LanguageConfidence | LanguagePageRatios | LanguageTotalPages | RefLanguage | RefLanguageDesc | RefLanguageConfidence | RefLanguagePageRatios | RefLanguageTotalPages | CrawledURLs | RootDomainIPAddress | TotalNonUniqueLinks | NonUniqueLinkTypeHomepages | NonUniqueLinkTypeIndirect | NonUniqueLinkTypeDeleted | NonUniqueLinkTypeNoFollow | NonUniqueLinkTypeProtocolHTTPS | NonUniqueLinkTypeFrame | NonUniqueLinkTypeImageLink | NonUniqueLinkTypeRedirect | NonUniqueLinkTypeTextLink | RefDomainTypeLive | RefDomainTypeFollow | RefDomainTypeHomepageLink | RefDomainTypeDirect | RefDomainTypeProtocolHTTPS | CitationFlow | TrustFlow | TrustMetric | TopicalTrustFlow_Topic_0 | TopicalTrustFlow_Value_0 | TopicalTrustFlow_Topic_1 | TopicalTrust_Flow_Value_1 | TopicalTrustFlow_Value_1
This is the raw XML:
<Result Code="OK" ErrorMessage="" FullError="">
<GlobalVars FirstBackLinkDate="2012-09-21" IndexBuildDate="2018-05-24 19:47:18" IndexType="0" MostRecentBackLinkDate="2018-04-23" QueriedRootDomains="1" QueriedSubDomains="0" QueriedURLs="0" QueriedURLsMayExist="0" ServerBuild="2018-06-11 13:52:01" ServerName="BRUNO28" ServerVersion="1.0.6736.23160" UniqueIndexID="20180524194718-HISTORICAL"/>
<DataTables Count="1">
<DataTable Name="Results" RowsCount="1" Headers="ItemNum|Item|ResultCode|Status|ExtBackLinks|RefDomains|AnalysisResUnitsCost|ACRank|ItemType|IndexedURLs|GetTopBackLinksAnalysisResUnitsCost|DownloadBacklinksAnalysisResUnitsCost|DownloadRefDomainBacklinksAnalysisResUnitsCost|RefIPs|RefSubNets|RefDomainsEDU|ExtBackLinksEDU|RefDomainsGOV|ExtBackLinksGOV|RefDomainsEDU_Exact|ExtBackLinksEDU_Exact|RefDomainsGOV_Exact|ExtBackLinksGOV_Exact|CrawledFlag|LastCrawlDate|LastCrawlResult|RedirectFlag|FinalRedirectResult|OutDomainsExternal|OutLinksExternal|OutLinksInternal|OutLinksPages|LastSeen|Title|RedirectTo|Language|LanguageDesc|LanguageConfidence|LanguagePageRatios|LanguageTotalPages|RefLanguage|RefLanguageDesc|RefLanguageConfidence|RefLanguagePageRatios|RefLanguageTotalPages|CrawledURLs|RootDomainIPAddress|TotalNonUniqueLinks|NonUniqueLinkTypeHomepages|NonUniqueLinkTypeIndirect|NonUniqueLinkTypeDeleted|NonUniqueLinkTypeNoFollow|NonUniqueLinkTypeProtocolHTTPS|NonUniqueLinkTypeFrame|NonUniqueLinkTypeImageLink|NonUniqueLinkTypeRedirect|NonUniqueLinkTypeTextLink|RefDomainTypeLive|RefDomainTypeFollow|RefDomainTypeHomepageLink|RefDomainTypeDirect|RefDomainTypeProtocolHTTPS|CitationFlow|TrustFlow|TrustMetric|TopicalTrustFlow_Topic_0|TopicalTrustFlow_Value_0|TopicalTrustFlow_Topic_1|TopicalTrustFlow_Value_1|TopicalTrustFlow_Topic_2|TopicalTrustFlow_Value_2" MaxTopicsRootDomain="30" MaxTopicsSubDomain="20" MaxTopicsURL="10" TopicsCount="3">
<Row>
0|nu.nl|OK|Found|508322106|165344|508322106|-1|1|4149991|5000|512472097|3356880|59147|26204|233|3613|43|308|73|1757|4|12|False| | |True| |5|10|44|1722150| |NU - Het laatste nieuws het eerst op NU.nl|https://www.nu.nl/|nl|Dutch/Flemish|92|99.9|482980|nl,en,de|Dutch/Flemish,English,German|87,93,58|96.5,3.1,0.1|76319583|1915923|52.85.201.19|611833777|15034990|53120677|444371798|95283418|52384870|388104|53497551|5655999|552292123|102171|115787|21952|150164|49554|76|70|70|News/Breaking News|69|Sports/Resources|45|Arts/Radio|43
</Row>
</DataTable>
</DataTables>
</Result>
When I use this XPath command in Google Sheets:
=importxml("http://enterprise.majesticseo.com/api_command?privatekey=xxx&accessToken=xxx&cmd=GetIndexItemInfo&item0=nu.nl&items=1","//DataTable")
I get the row result. That is great, but I also need to put the header names in the first row of the sheet.
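Answer 0 (score: 3)
XPath intro :-)
With //DataTable you get the complete node of any <DataTable> anywhere in your XML (no namespaces are involved here).
As a rule of thumb, it is better to be as specific as possible (rather use /Result/DataTables/DataTable). But this is not the answer to your question...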
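To make the rule of thumb concrete, here is a minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree module, which supports only a subset of XPath and wants paths relative to the root element, and it uses a made-up, trimmed-down document with a hypothetical <Archive> element that is not part of the real API response.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down document: a second <DataTable> sits outside <DataTables>.
doc = ET.fromstring(
    "<Result>"
    "<DataTables><DataTable Name='Results'/></DataTables>"
    "<Archive><DataTable Name='Old'/></Archive>"
    "</Result>"
)

# ".//DataTable" is the relative counterpart of //DataTable: any depth below the root.
anywhere = [dt.get("Name") for dt in doc.findall(".//DataTable")]

# "./DataTables/DataTable" corresponds to /Result/DataTables/DataTable.
specific = [dt.get("Name") for dt in doc.findall("./DataTables/DataTable")]

print(anywhere)  # both tables, wherever they live
print(specific)  # only the table under <DataTables>
```

The descendant search picks up every <DataTable> it can find, while the explicit path only matches the one you actually mean.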
Imagine an XML like this:
<root>
<innerNode attr="1"><a>Some a content</a><b>Some b content</b></innerNode>
<innerNode attr="2"><a>aaa</a><b>bbb</b></innerNode>
</root>
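With /root/innerNode you get both <innerNode> elements with all their content.
With /root/innerNode[(b/text())[1]="bbb"] you get only the one <innerNode> whose <b> has a text() of "bbb".
With /root/innerNode[@attr="1"] you get the one <innerNode> whose attribute attr has the value "1".
All three XPath samples bring back the whole node, including sub-nodes, attributes, and so on.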
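If you want to try these three selections outside of a spreadsheet, the following is a rough sketch using Python's standard-library xml.etree.ElementTree. Note the assumptions: ElementTree only implements a limited XPath subset, so paths are written relative to the root element, and the text predicate is spelled [b='bbb'] instead of [(b/text())[1]="bbb"]. The variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<root>'
    '<innerNode attr="1"><a>Some a content</a><b>Some b content</b></innerNode>'
    '<innerNode attr="2"><a>aaa</a><b>bbb</b></innerNode>'
    '</root>'
)

# Counterpart of /root/innerNode: every <innerNode> child of the root.
all_nodes = root.findall("./innerNode")

# Counterpart of /root/innerNode[(b/text())[1]="bbb"], in ElementTree's subset.
by_text = root.findall("./innerNode[b='bbb']")

# Counterpart of /root/innerNode[@attr="1"].
by_attr = root.findall("./innerNode[@attr='1']")

# Each result is a whole element, children and attributes included.
for label, nodes in (("all", all_nodes), ("by text", by_text), ("by attr", by_attr)):
    print(label, [ET.tostring(node, encoding="unicode") for node in nodes])
```

In every case the result is a list of complete elements, which mirrors why the spreadsheet formula returns the row content rather than an attribute.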
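If you need only the value of an attribute, you have to ask for it:
(/root/innerNode/@attr)[2]
...returns the attribute's value of the second <innerNode> (actually the second occurrence of the attribute).
/root/innerNode[(b/text())[1]="Some b content"]/@attr
...returns the attribute's value of the <innerNode> whose <b> has a text() of "Some b content".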
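The same two lookups can be sketched in Python, again assuming the standard-library xml.etree.ElementTree. Its XPath subset cannot select attribute nodes directly, so a trailing /@attr is not available there; the sketch selects the elements first and then reads the attribute with get(). The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<root>'
    '<innerNode attr="1"><a>Some a content</a><b>Some b content</b></innerNode>'
    '<innerNode attr="2"><a>aaa</a><b>bbb</b></innerNode>'
    '</root>'
)

# Rough counterpart of (/root/innerNode/@attr)[2]:
# collect attr from every <innerNode> that has one, then take the second occurrence.
attr_values = [node.get("attr") for node in root.findall("./innerNode[@attr]")]
second_attr = attr_values[1]

# Rough counterpart of /root/innerNode[(b/text())[1]="Some b content"]/@attr:
# find the matching element first, then read its attribute.
matched_attr = root.find("./innerNode[b='Some b content']").get("attr")

print(second_attr)
print(matched_attr)
```

Either way, the point is the same as in the XPath versions: you get a plain attribute value back instead of a whole node.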
You want to read the attribute Headers of the element <DataTable> located at /Result/DataTables. Just use