Question

I am scraping http://m.imdb.com/feature/bornondate. I believe my code was working before, but today it gives me this error:

"ValueError: too many values to unpack" File "/Users/Desktop/IMDB_BornToday_Scraping.py", line 28, in profession, bestWork = person.split(",")

(The page changes daily.) What is going wrong here?

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import csv

c = csv.writer(open("celebritiesBornToday.csv", "wb"))
driver = webdriver.PhantomJS(executable_path='/Users/Downloads/phantomjs-2.0.0-macosx/bin/phantomjs')
driver.get("http://m.imdb.com/feature/bornondate")

# waiting for posters to load
wait = WebDriverWait(driver, 10)
posters = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "section.posters")))

#giving headings to the columns of the csv file
c.writerow(['name','image','profession','bestWork'])

# extracting the data poster by poster
for a in posters.find_elements_by_css_selector('a.poster'):

    # Fetching the picture of the celebrity
    image = a.find_element_by_tag_name('img').get_attribute('src').split('._V1.')[0] + '._V1_SX214_AL_.jpg'
    # Fetching the profession and bestWork of the celebrity as person
    person = a.find_element_by_css_selector('div.detail').text
    # Splitting person into profession and bestWork
    profession, bestWork = person.split(",")
    # Fetching the name of the celebrity
    name = a.find_element_by_css_selector('span.title').text

    #Printing the Name of the Celebrity
    print "Name of the celebrity: "+name
    #Printing the Image of the Celebrity
    print "Image: "+image
    #Printing the Profession of the Celebrity
    print "Profession: "+profession
    #Printing the BestWork of the Celebrity
    print "BestWork: "+bestWork

    #saving the name, image, profession, bestWork of the celebrity into a csv file
    c.writerow([name,image,profession,bestWork])

Answer 1

The problem is that the first person listed for March 16 has a best-work title that itself contains a comma:

Actor, "I, Robot"

Splitting that text on every comma yields three pieces, so unpacking the result into two names raises a ValueError. Demo:

>>> s = 'Actor, "I, Robot"'
>>> profession, bestWork = s.split(',')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack

You need to split on the first comma only, by passing a maxsplit of 1:

profession, bestWork = person.split(",", 1)

Demo:

>>> profession, bestWork = s.split(',', 1)
>>> profession
'Actor'
>>> bestWork
' "I, Robot"'
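Note that the second value keeps its leading space, and that split(",", 1) would still fail to unpack if some entry had no comma at all. A minimal sketch of a more defensive variant, assuming the detail text may also arrive without a best work; the helper name split_detail is hypothetical and not part of the original script:

```python
def split_detail(detail):
    """Split 'profession, best work' on the first comma only.

    str.partition always returns three parts, so unpacking cannot fail
    even when the text contains no comma or several commas.
    """
    profession, _, best_work = detail.partition(",")
    # Strip the whitespace left around the separator.
    return profession.strip(), best_work.strip()


profession, bestWork = split_detail('Actor, "I, Robot"')
```

In the scraping loop this would stand in for the person.split(",") line; an entry with no comma would yield an empty bestWork rather than an exception.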

Python with Selenium, ValueError: too many values to unpack
