Python with Selenium, ValueError: too many values to unpack

Asked: 2015-03-15 18:04:16

Tags: python selenium selenium-webdriver web-scraping

I am scraping http://m.imdb.com/feature/bornondate. I believe my code was working before, but today it gives me this error:

  

ValueError: too many values to unpack
File "/Users/Desktop/IMDB_BornToday_Scraping.py", line 28, in <module>
    profession, bestWork = person.split(",")

(The page changes daily.) What is wrong here?

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import csv

c = csv.writer(open("celebritiesBornToday.csv", "wb"))
driver = webdriver.PhantomJS(executable_path='/Users/Downloads/phantomjs-2.0.0-macosx/bin/phantomjs')
driver.get("http://m.imdb.com/feature/bornondate")

# waiting for posters to load
wait = WebDriverWait(driver, 10)
posters = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "section.posters")))

#giving headings to the columns of the csv file
c.writerow(['name','image','profession','bestWork'])

# extracting the data poster by poster
for a in posters.find_elements_by_css_selector('a.poster'):

    # Fetching the picture of the celebrity
    image = a.find_element_by_tag_name('img').get_attribute('src').split('._V1.')[0] + '._V1_SX214_AL_.jpg'
    # Fetching the profession and bestWork of the celebrity as person
    person = a.find_element_by_css_selector('div.detail').text
    # Splitting person into profession and bestWork
    profession, bestWork = person.split(",")
    # Fetching the name of the celebrity
    name = a.find_element_by_css_selector('span.title').text

    #Printing the Name of the Celebrity
    print "Name of the celebrity: "+name
    #Printing the Image of the Celebrity
    print "Image: "+image
    #Printing the Profession of the Celebrity
    print "Profession: "+profession
    #Printing the BestWork of the Celebrity
    print "BestWork: "+bestWork

    #saving the name, image, profession, bestWork of the celebrity into a csv file
    c.writerow([name,image,profession,bestWork])

1 Answer:

Answer 0 (score: 1)

The problem is that for the first person listed, born on March 16, the best-work title itself contains a comma:

Actor, "I, Robot"

Splitting on every comma yields three pieces, so unpacking the result into two names raises the ValueError. Demo:

>>> s = 'Actor, "I, Robot"'
>>> profession, bestWork = s.split(',')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
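
To see why the unpacking fails, it can help to look at the list that `split(',')` returns before assigning it to two names. A small sketch using the same sample string (the variable name `parts` is only for illustration):

```python
# Sample detail text taken from the demo above.
s = 'Actor, "I, Robot"'

# Without a split limit, every comma is a separator,
# including the one inside the quoted title.
parts = s.split(',')

print(len(parts))  # 3 pieces, but only two names to unpack into
print(parts)       # ['Actor', ' "I', ' Robot"']
```

Two targets on the left and three items on the right is exactly the "too many values to unpack" condition.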

You need to split only on the first occurrence of the comma, by passing 1 as the maximum number of splits:

profession, bestWork = person.split(",", 1)

Demo:

>>> profession, bestWork = s.split(',', 1)
>>> profession
'Actor'
>>> bestWork
' "I, Robot"'
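
Note the leading space left in `bestWork`. Also, if a detail line ever contained no comma at all (an assumption, since the page changes daily and its format is not guaranteed), a two-name unpack of the split result would fail the other way, with too few values. A minimal sketch that handles both cases with `str.partition`; `split_detail` is a hypothetical helper name, not something from the original script:

```python
def split_detail(detail):
    """Split 'profession, best work' on the first comma only.

    Hypothetical helper: returns (profession, best_work), where best_work
    is an empty string if the text contains no comma.
    """
    # partition always returns a 3-tuple: (before, separator, after).
    profession, _, best_work = detail.partition(",")
    # Strip the whitespace left around the separator.
    return profession.strip(), best_work.strip()


profession, bestWork = split_detail('Actor, "I, Robot"')
print(profession)  # Actor
print(bestWork)    # "I, Robot"
```

In the scraping loop this would stand in for the `person.split(",")` line, e.g. `profession, bestWork = split_detail(person)`.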