I am currently using pywikibot to get the categories of a given Wikipedia page (e.g. support-vector machine), as follows.
import pywikibot as pw
print([i.title() for i in list(pw.Page(pw.Site('en'), 'support-vector machine').categories())])
The result I get is:
[
'Category:All articles with specifically marked weasel-worded phrases',
'Category:All articles with unsourced statements',
'Category:Articles with specifically marked weasel-worded phrases from May 2018',
'Category:Articles with unsourced statements from June 2013',
'Category:Articles with unsourced statements from March 2017',
'Category:Articles with unsourced statements from March 2018',
'Category:CS1 maint: Uses editors parameter',
'Category:Classification algorithms',
'Category:Statistical classification',
'Category:Support vector machines',
'Category:Wikipedia articles needing clarification from November 2017',
'Category:Wikipedia articles with BNF identifiers',
'Category:Wikipedia articles with GND identifiers',
'Category:Wikipedia articles with LCCN identifiers'
]
As you can see, the result includes many of Wikipedia's tracking and maintenance categories, such as 'Category:All articles with unsourced statements' and 'Category:CS1 maint: Uses editors parameter'.
However, the only categories I am interested in are 'Category:Classification algorithms', 'Category:Statistical classification' and 'Category:Support vector machines'.
I am wondering whether there is a way to get all of Wikipedia's tracking and maintenance categories, so that I can remove them from the result and keep only the informative ones.
Alternatively, if there is any other way of eliminating them from the result, please suggest it.
I am happy to provide more details if needed.
Answer 0 (score: 2)
pywikibot does not currently provide API features for filtering out hidden categories. You can do this manually by checking for the hidden key in categoryinfo:
import pywikibot as pw
site = pw.Site('en', 'wikipedia')
print([
    cat.title()
    for cat in pw.Page(site, 'support-vector machine').categories()
    # hidden (tracking/maintenance) categories carry a 'hidden' key
    if 'hidden' not in cat.categoryinfo
])
This gives:
['Category:Classification algorithms',
'Category:Statistical classification',
'Category:Support vector machines']
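If you need this for more than one page, the same check can be wrapped in a small helper. This is only a sketch: the function names visible_titles and page_visible_categories are made up for illustration, and it assumes, as in the snippet above, that hidden categories expose a 'hidden' key in categoryinfo. Reading categoryinfo may cost one API request per category, so it can be slow on pages with many categories.

```python
def visible_titles(categories):
    """Return the titles of categories whose categoryinfo has no 'hidden' key.

    `categories` is any iterable of objects with a .title() method and a
    .categoryinfo mapping (pywikibot Category objects in practice).
    """
    return [cat.title() for cat in categories
            if 'hidden' not in cat.categoryinfo]


def page_visible_categories(title, lang='en', family='wikipedia'):
    """Hypothetical convenience wrapper around the approach shown above."""
    # Imported lazily so visible_titles() can be used without pywikibot.
    import pywikibot as pw
    site = pw.Site(lang, family)
    return visible_titles(pw.Page(site, title).categories())
```

Usage would then be print(page_visible_categories('support-vector machine')).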
For more information, see https://www.mediawiki.org/wiki/Help:Categories#Hidden_categories and https://en.wikipedia.org/wiki/Wikipedia:Categorization#Hiding_categories.
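As an alternative to filtering on the client, the MediaWiki API itself can leave hidden categories out: prop=categories accepts clshow=!hidden. The following is a rough sketch that talks to the API directly instead of going through pywikibot. The helper names and the User-Agent string are placeholders of mine, and it assumes the page has few enough categories to fit in one response (no continuation handling).

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query(title):
    """Query parameters asking for the non-hidden categories of one page."""
    return {
        "action": "query",
        "prop": "categories",
        "titles": title,
        "clshow": "!hidden",   # server-side filter: skip hidden categories
        "cllimit": "max",
        "redirects": 1,        # follow redirects to the canonical page title
        "format": "json",
        "formatversion": 2,    # 'pages' comes back as a list
    }


def extract_titles(response):
    """Pull category titles out of a formatversion=2 query response."""
    pages = response.get("query", {}).get("pages", [])
    return [cat["title"] for page in pages
            for cat in page.get("categories", [])]


def fetch_visible_categories(title):
    """Perform the request (needs network access)."""
    url = API_URL + "?" + urllib.parse.urlencode(build_query(title))
    # Placeholder User-Agent; replace with one that identifies your tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "category-sketch/0.1 (example)"})
    with urllib.request.urlopen(request) as resp:
        return extract_titles(json.load(resp))
```

Calling fetch_visible_categories('support-vector machine') should then return only the topical categories, without any per-category lookups.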