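I am trying to fetch movie plots and other information from Wikipedia pages. I have each movie's name and year, and from those I need to find the exact movie along with its plot and other details.
I get the following response: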
{
"batchcomplete": "",
"continue": {
"sroffset": 10,
"continue": "-||"
},
"query": {
"searchinfo": {
"totalhits": 176
},
"search": [
{
"ns": 0,
"title": "The Matrix",
"pageid": 30007,
"size": 123422,
"wordcount": 12668,
"snippet": "The <span class=\"searchmatch\">Matrix</span> is a 1999 science fiction action film written and directed by the Wachowskis that stars Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss,",
"timestamp": "2019-05-17T20:53:05Z"
},
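I need to search across all movies, not just English-language ones. I need to get the text of the Plot section directly from the search.
Answer 0 (score: 1)
First, install: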
$ pip3 install imdbpy wikipedia
Then:
>>> import wikipedia
>>> from imdb import IMDb
>>> imdb = IMDb()
>>> imdb.search_movie('avengers')
[<Movie id:0848228[http] title:_The Avengers (2012)_>, <Movie id:0203247[http] title:_"Avengers: United They Stand" (1999)_>, <Movie id:2164490[http] title:_Avengers (1987) (VG)_>, <Movie id:4154796[http] title:_Avengers: Endgame (2019)_>, <Movie id:4154756[http] title:_Avengers: Infinity War (2018)_>, <Movie id:2395427[http] title:_Avengers: Age of Ultron (2015)_>, <Movie id:2455546[http] title:_"Avengers Assemble" (2013)_>, <Movie id:1626038[http] title:_"The Avengers: Earth's Mightiest Heroes" (2010)_>, <Movie id:0458339[http] title:_Captain America: The First Avenger (2011)_>, <Movie id:0118661[http] title:_The Avengers (1998)_>, <Movie id:0054518[http] title:_"The Avengers" (1961)_>, <Movie id:1355644[http] title:_Passengers (I) (2016)_>, <Movie id:8836988[http] title:_Avengement (I) (2019)_>, <Movie id:0473445[http] title:_Avenger (2006) (TV)_>, <Movie id:9426186[http] title:_Revenger (2018)_>, <Movie id:2378453[http] title:_Avenged (2013)_>, <Movie id:4296026[http] title:_Avengers Grimm (2015) (V)_>, <Movie id:0491703[http] title:_Ultimate Avengers (2006) (V)_>, <Movie id:0090190[http] title:_The Toxic Avenger (1984)_>, <Movie id:0056174[http] title:_The Avenger (1962)_>]
>>> title = imdb.search_movie('avengers')[0].data['title']
>>> title
'The Avengers'
>>> wiki_page = wikipedia.page(title)
>>> wiki_page.url
'https://en.wikipedia.org/wiki/Avengers_(comics)'
>>> print(wiki_page.content)
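Note that the URL above is the comics article, not the film, so the bare title is ambiguous. Below is a minimal sketch of one way to narrow it down with the year and pull out only the Plot section. The helper names (candidate_titles, extract_section, fetch_plot) are made up for illustration. It assumes the page follows Wikipedia's usual "Title (YEAR film)" naming and that page.content marks sections as "== Plot ==". For other language editions the suffix and the section heading will differ, so both would need adjusting.

```python
import re


def candidate_titles(title, year):
    """Page titles to try, most specific first.

    Assumes English Wikipedia's usual film disambiguation suffixes.
    """
    return [f"{title} ({year} film)", f"{title} (film)", title]


def extract_section(content, heading):
    """Return the text under `== heading ==` in plain-text page content.

    Subsections (deeper heading levels) are kept; returns None if the
    heading is not present.
    """
    pattern = re.compile(r"^(=+)[ \t]*(.+?)[ \t]*=+[ \t]*$", re.MULTILINE)
    matches = list(pattern.finditer(content))
    for i, m in enumerate(matches):
        if m.group(2).lower() != heading.lower():
            continue
        level = len(m.group(1))
        end = len(content)
        # Stop at the next heading of the same or a higher level.
        for nxt in matches[i + 1:]:
            if len(nxt.group(1)) <= level:
                end = nxt.start()
                break
        return content[m.end():end].strip()
    return None


def fetch_plot(title, year, lang="en", heading="Plot"):
    """Try each candidate title and return (url, plot text), or None."""
    import wikipedia  # imported here so the helpers above work offline

    wikipedia.set_lang(lang)
    for candidate in candidate_titles(title, year):
        try:
            page = wikipedia.page(candidate, auto_suggest=False)
        except (wikipedia.exceptions.PageError,
                wikipedia.exceptions.DisambiguationError):
            continue
        plot = extract_section(page.content, heading)
        if plot:
            return page.url, plot
    return None
```

Usage would be something like fetch_plot('The Avengers', 2012), with the title and year taken from the IMDb result. Passing auto_suggest=False keeps the library from silently swapping in a different title.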
See: