Where can I download a movie dataset?

Date: 2012-09-06 05:55:32

Tags: web-scraping dataset imdb

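I want to download a movie dump, as a single file, that contains basic information such as movie titles and cast lists. I looked at a few options, such as http://api.themoviedb.org/2.1/. TheMovieDB does not offer a bulk download of its data. IMDB has the data, but it appears to be spread across several files. I also cannot figure out how to join the data in the separate files for actors, movie titles, and so on, because they do not seem to share any common key. If I am missing something here, please let me know.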

Can someone tell me how to download a movie dataset?

1 Answer:

Answer 0 (score: 2)

You can use Freebase to download movies and actors in JSON format. See the API wiki for details.

For example, the query:

GET https://www.googleapis.com/freebase/v1/mqlread?query=[{%22type%22:%22/film/actor%22,%22id%22:null,%22name%22:null}]

returns:

{
  "result": [{
    "type": "/film/actor",
    "id": "/en/milla_jovovich",
    "name": "Milla Jovovich"
  }, {
    "type": "/film/actor",
    "id": "/en/angus_macfadyen",
    "name": "Angus Macfadyen"
  }, {
    "type": "/film/actor",
    "id": "/en/aisha_tyler",
    "name": "Aisha Tyler"
  }, {
    "type": "/film/actor",
    "id": "/en/stephen_dorff",
    "name": "Stephen Dorff"
  }, {
    "type": "/film/actor",
    "id": "/en/vincent_laresca",
    "name": "Vincent Laresca"
  }, {
    "type": "/film/actor",
    "id": "/en/dawn_greenhalgh",
    "name": "Dawn Greenhalgh"
  }, {
    "type": "/film/actor",
    "id": "/en/nola_augustson",
    "name": "Nola Augustson"
  }, {
    "type": "/film/actor",
    "id": "/en/dudley_moore",
    "name": "Dudley Moore"
  }, {
    "type": "/film/actor",
    "id": "/en/julie_andrews",
    "name": "Julie Andrews"
  }, {
    "type": "/film/actor",
    "id": "/en/bo_derek",
    "name": "Bo Derek"
  }, {
    "type": "/film/actor",
    "id": "/en/robert_webber",
    "name": "Robert Webber"
  }, {
    "type": "/film/actor",
    "id": "/en/dee_wallace-stone",
    "name": "Dee Wallace-Stone"
  }, {
    "type": "/film/actor",
    "id": "/en/ryan_phillippe",
    "name": "Ryan Phillippe"
  }, {
    "type": "/film/actor",
    "id": "/en/salma_hayek",
    "name": "Salma Hayek"
  }, {
    "type": "/film/actor",
    "id": "/en/neve_campbell",
    "name": "Neve Campbell"
  }, {
    "type": "/film/actor",
    "id": "/en/mike_myers",
    "name": "Mike Myers"
  }, {
    "type": "/film/actor",
    "id": "/en/satoshi_tsumabuki",
    "name": "Satoshi Tsumabuki"
  }, {
    "type": "/film/actor",
    "id": "/en/masanobu_ando",
    "name": "Masanobu Ando"
  }, {
    "type": "/film/actor",
    "id": "/en/david_gahan",
    "name": "Dave Gahan"
  }, {
    "type": "/film/actor",
    "id": "/en/martin_gore",
    "name": "Martin Gore"
  }, {
    "type": "/film/actor",
    "id": "/en/andrew_fletcher_1961",
    "name": "Andrew Fletcher"
  }, {
    "type": "/film/actor",
    "id": "/en/alan_wilder",
    "name": "Alan Wilder"
  }, {
    "type": "/film/actor",
    "id": "/en/gerard_butler",
    "name": "Gerard Butler"
  }, {
    "type": "/film/actor",
    "id": "/en/lena_headey",
    "name": "Lena Headey"
  }, {
    "type": "/film/actor",
    "id": "/en/david_wenham",
    "name": "David Wenham"
  }, {
    "type": "/film/actor",
    "id": "/en/robert_de_niro",
    "name": "Robert De Niro"
  }, {
    "type": "/film/actor",
    "id": "/en/gerard_depardieu",
    "name": "G\u00e9rard Depardieu"
  }, {
    "type": "/film/actor",
    "id": "/en/dominique_sanda",
    "name": "Dominique Sanda"
  }, {
    "type": "/film/actor",
    "id": "/en/john_belushi",
    "name": "John Belushi"
  }, {
    "type": "/film/actor",
    "id": "/en/ned_beatty",
    "name": "Ned Beatty"
  }, {
    "type": "/film/actor",
    "id": "/en/dan_aykroyd",
    "name": "Dan Aykroyd"
  }, {
    "type": "/film/actor",
    "id": "/en/lorraine_gary",
    "name": "Lorraine Gary"
  }, {
    "type": "/film/actor",
    "id": "/en/murray_hamilton",
    "name": "Murray Hamilton"
  }, {
    "type": "/film/actor",
    "id": "/en/robert_downey_jr",
    "name": "Robert Downey Jr."
  }, {
    "type": "/film/actor",
    "id": "/en/kiefer_sutherland",
    "name": "Kiefer Sutherland"
  }, {
    "type": "/film/actor",
    "id": "/en/winona_ryder",
    "name": "Winona Ryder"
  }, {
    "type": "/film/actor",
    "id": "/en/john_hurt",
    "name": "John Hurt"
  }, {
    "type": "/film/actor",
    "id": "/en/richard_burton",
    "name": "Richard Burton"
  }, {
    "type": "/film/actor",
    "id": "/en/suzanna_hamilton",
    "name": "Suzanna Hamilton"
  }, {
    "type": "/film/actor",
    "id": "/en/cyril_cusack",
    "name": "Cyril Cusack"
  }, {
    "type": "/film/actor",
    "id": "/en/gregor_fisher",
    "name": "Gregor Fisher"
  }, {
    "type": "/film/actor",
    "id": "/en/tony_leung_chiu_wai",
    "name": "Tony Leung Chiu Wai"
  }, {
    "type": "/film/actor",
    "id": "/en/gong_li",
    "name": "Gong Li"
  }, {
    "type": "/film/actor",
    "id": "/en/faye_wong",
    "name": "Faye Wong"
  }, {
    "type": "/film/actor",
    "id": "/en/takuya_kimura",
    "name": "Takuya Kimura"
  }, {
    "type": "/film/actor",
    "id": "/en/zhang_ziyi",
    "name": "Zhang Ziyi"
  }, {
    "type": "/film/actor",
    "id": "/en/carina_lau",
    "name": "Carina Lau"
  }, {
    "type": "/film/actor",
    "id": "/en/chang_chen",
    "name": "Chang Chen"
  }, {
    "type": "/film/actor",
    "id": "/en/bird_mcintyre",
    "name": "Bird McIntyre"
  }, {
    "type": "/film/actor",
    "id": "/en/maggie_cheung",
    "name": "Maggie Cheung"
  }, {
    "type": "/film/actor",
    "id": "/en/chevy_chase",
    "name": "Chevy Chase"
  }, {
    "type": "/film/actor",
    "id": "/en/steve_martin",
    "name": "Steve Martin"
  }, {
    "type": "/film/actor",
    "id": "/en/martin_short",
    "name": "Martin Short"
  }, {
    "type": "/film/actor",
    "id": "/en/joe_mantegna",
    "name": "Joe Mantegna"
  }, {
    "type": "/film/actor",
    "id": "/en/jon_lovitz",
    "name": "Jon Lovitz"
  }, {
    "type": "/film/actor",
    "id": "/en/alfonso_arau",
    "name": "Alfonso Arau"
  }, {
    "type": "/film/actor",
    "id": "/en/tony_plana",
    "name": "Tony Plana"
  }, {
    "type": "/film/actor",
    "id": "/en/al_pacino",
    "name": "Al Pacino"
  }, {
    "type": "/film/actor",
    "id": "/en/carmen_maura",
    "name": "Carmen Maura"
  }, {
    "type": "/film/actor",
    "id": "/en/luis_hostalot",
    "name": "Luis Hostalot"
  }, {
    "type": "/film/actor",
    "id": "/en/veronica_forque",
    "name": "Veronica Forqu\u00e9"
  }, {
    "type": "/film/actor",
    "id": "/en/hume_cronyn",
    "name": "Hume Cronyn"
  }, {
    "type": "/film/actor",
    "id": "/en/jessica_tandy",
    "name": "Jessica Tandy"
  }, {
    "type": "/film/actor",
    "id": "/en/frank_mcrae",
    "name": "Frank McRae"
  }, {
    "type": "/film/actor",
    "id": "/en/elizabeth_pena",
    "name": "Elizabeth Pe\u00f1a"
  }, {
    "type": "/film/actor",
    "id": "/en/dennis_boutsikaris",
    "name": "Dennis Boutsikaris"
  }, {
    "type": "/film/actor",
    "id": "/en/hal_warren",
    "name": "Hal Warren"
  }, {
    "type": "/film/actor",
    "id": "/en/tom_neyman",
    "name": "Tom Neyman"
  }, {
    "type": "/film/actor",
    "id": "/en/john_reynolds_1941",
    "name": "John Reynolds"
  }, {
    "type": "/film/actor",
    "id": "/en/rajnikanth",
    "name": "Rajnikanth"
  }, {
    "type": "/film/actor",
    "id": "/en/sridevi_kapoor",
    "name": "Sridevi Kapoor"
  }, {
    "type": "/film/actor",
    "id": "/en/kantimathi",
    "name": "Kantimathi"
  }, {
    "type": "/film/actor",
    "id": "/en/konkona_sen_sharma",
    "name": "Konkona Sen Sharma"
  }, {
    "type": "/film/actor",
    "id": "/en/shabana_azmi",
    "name": "Shabana Azmi"
  }, {
    "type": "/film/actor",
    "id": "/en/soumitra_chatterjee",
    "name": "Soumitra Chatterjee"
  }, {
    "type": "/film/actor",
    "id": "/en/waheeda_rehman",
    "name": "Waheeda Rehman"
  }, {
    "type": "/film/actor",
    "id": "/en/rahul_bose",
    "name": "Rahul Bose"
  }, {
    "type": "/film/actor",
    "id": "/en/william_hopper",
    "name": "William Hopper"
  }, {
    "type": "/film/actor",
    "id": "/en/joan_taylor",
    "name": "Joan Taylor"
  }, {
    "type": "/film/actor",
    "id": "/en/frank_puglia",
    "name": "Frank Puglia"
  }, {
    "type": "/film/actor",
    "id": "/en/james_garner",
    "name": "James Garner"
  }, {
    "type": "/film/actor",
    "id": "/en/rod_taylor_1930",
    "name": "Rod Taylor"
  }, {
    "type": "/film/actor",
    "id": "/en/eva_marie_saint",
    "name": "Eva Marie Saint"
  }, {
    "type": "/film/actor",
    "id": "/en/paul_walker",
    "name": "Paul Walker"
  }, {
    "type": "/film/actor",
    "id": "/en/eva_mendes",
    "name": "Eva Mendes"
  }, {
    "type": "/film/actor",
    "id": "/en/devon_aoki",
    "name": "Devon Aoki"
  }, {
    "type": "/film/actor",
    "id": "/en/john_payne_1912",
    "name": "John Payne"
  }, {
    "type": "/film/actor",
    "id": "/en/evelyn_keyes",
    "name": "Evelyn Keyes"
  }, {
    "type": "/film/actor",
    "id": "/en/brad_dexter",
    "name": "Brad Dexter"
  }, {
    "type": "/film/actor",
    "id": "/en/frank_faylen",
    "name": "Frank Faylen"
  }, {
    "type": "/film/actor",
    "id": "/en/peggie_castle",
    "name": "Peggie Castle"
  }, {
    "type": "/film/actor",
    "id": "/en/jean-hugues_anglade",
    "name": "Jean-Hugues Anglade"
  }, {
    "type": "/film/actor",
    "id": "/en/beatrice_dalle",
    "name": "B\u00e9atrice Dalle"
  }, {
    "type": "/film/actor",
    "id": "/en/vincent_lindon",
    "name": "Vincent Lindon"
  }, {
    "type": "/film/actor",
    "id": "/en/dominique_pinon",
    "name": "Dominique Pinon"
  }, {
    "type": "/film/actor",
    "id": "/en/joaquin_phoenix",
    "name": "Joaquin Phoenix"
  }, {
    "type": "/film/actor",
    "id": "/en/james_gandolfini",
    "name": "James Gandolfini"
  }, {
    "type": "/film/actor",
    "id": "/en/catherine_keener",
    "name": "Catherine Keener"
  }, {
    "type": "/film/actor",
    "id": "/en/norman_reedus",
    "name": "Norman Reedus"
  }, {
    "type": "/film/actor",
    "id": "/en/dean_martin",
    "name": "Dean Martin"
  }]
}
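
As a rough illustration of the request and response above, here is a minimal Python sketch that builds the same kind of mqlread URL and pulls the names out of a response body. The helper names `build_mqlread_url` and `extract_names` are made up for this example, and the sample response is a one-entry excerpt in the shape shown above. The sketch makes no network request, so it says nothing about whether the endpoint answers.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the query shown above.
MQLREAD_ENDPOINT = "https://www.googleapis.com/freebase/v1/mqlread"


def build_mqlread_url(type_id):
    """Return an mqlread URL asking for the id and name of topics of type_id.

    Hypothetical helper, not part of any Freebase client library.
    """
    query = [{"type": type_id, "id": None, "name": None}]
    # json.dumps turns None into null; urlencode percent-encodes the JSON text.
    return MQLREAD_ENDPOINT + "?" + urlencode({"query": json.dumps(query)})


def extract_names(response_text):
    """Collect the "name" field of each entry in the "result" array."""
    payload = json.loads(response_text)
    return [entry["name"] for entry in payload["result"]]


# One-entry excerpt in the same shape as the response above.
sample = (
    '{"result": [{"type": "/film/actor", '
    '"id": "/en/gerard_depardieu", "name": "G\\u00e9rard Depardieu"}]}'
)

print(build_mqlread_url("/film/actor"))
print(extract_names(sample))
```

The generated URL is escaped more heavily than the hand-written one, but it should decode to the same query. The `\u00e9` escape in the response is decoded by `json.loads`, so no extra handling is needed for accented names.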

Similarly, you would use:

https://www.googleapis.com/freebase/v1/mqlread?query=[{%22type%22:%22/film/film%22,%22id%22:null,%22name%22:null}]

to get movie titles.
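
Since the question asks for a single file, one possible way to combine the two result lists is to write them to one CSV. The sketch below assumes each entry has the `type`, `id`, and `name` fields seen in the response. The function name `write_rows` is made up for this example, and the film entry is a placeholder, not real Freebase data.

```python
import csv
import io


def write_rows(results, out):
    """Write a header row, then one CSV row (type, id, name) per result entry."""
    writer = csv.writer(out)
    writer.writerow(["type", "id", "name"])
    for entry in results:
        writer.writerow([entry["type"], entry["id"], entry["name"]])


# First entry copied from the actor response; the film entry is a placeholder.
actors = [{"type": "/film/actor", "id": "/en/milla_jovovich", "name": "Milla Jovovich"}]
films = [{"type": "/film/film", "id": "/en/example_film", "name": "Example Film"}]

# An in-memory buffer stands in for a real file opened with open(..., newline="").
buffer = io.StringIO()
write_rows(actors + films, buffer)
print(buffer.getvalue())
```

This only concatenates the two lists. It does not link actors to the films they appeared in, because the two queries above do not return that relationship.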