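I am scraping http://m.imdb.com/feature/bornondate. My code was working before, but today it gives me this error:
File "/Users/Desktop/IMDB_BornToday_Scraping.py", line 28, in <module>
    profession, bestWork = person.split(",")
ValueError: too many values to unpack
(The page changes daily.) What is the problem here?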
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import csv
c = csv.writer(open("celebritiesBornToday.csv", "wb"))
driver = webdriver.PhantomJS(executable_path='/Users/Downloads/phantomjs-2.0.0-macosx/bin/phantomjs')
driver.get("http://m.imdb.com/feature/bornondate")
# waiting for posters to load
wait = WebDriverWait(driver, 10)
posters = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "section.posters")))
#giving headings to the columns of the csv file
c.writerow(['name','image','profession','bestWork'])
# extracting the data poster by poster
for a in posters.find_elements_by_css_selector('a.poster'):
    # Fetching the picture of the celebrity
    image = a.find_element_by_tag_name('img').get_attribute('src').split('._V1.')[0] + '._V1_SX214_AL_.jpg'
    # Fetching the profession and bestWork of the celebrity as person
    person = a.find_element_by_css_selector('div.detail').text
    # Splitting person into profession and bestWork
    profession, bestWork = person.split(",")
    # Fetching the name of the celebrity
    name = a.find_element_by_css_selector('span.title').text
    # Printing the name of the celebrity
    print "Name of the celebrity: "+name
    # Printing the image of the celebrity
    print "Image: "+image
    # Printing the profession of the celebrity
    print "Profession: "+profession
    # Printing the bestWork of the celebrity
    print "BestWork: "+bestWork
    # Saving the name, image, profession, bestWork of the celebrity into a csv file
    c.writerow([name,image,profession,bestWork])
Answer 0 (score: 1)
The problem is that the first person listed as born on March 16 has a best-work title that itself contains a comma:
Actor, "I, Robot"
This causes a ValueError when the result of the split is unpacked. Demo:
>>> s = 'Actor, "I, Robot"'
>>> profession, bestWork = s.split(',')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
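To see why the unpacking fails, it may help to look at the list that split() actually returns. The snippet below is only an illustrative sketch using the same sample string; the variable name parts is introduced here and is not part of the original script.

```python
# Same sample string as in the demo above.
s = 'Actor, "I, Robot"'

# Splitting on every comma cuts the title in two as well.
parts = s.split(',')

print(len(parts))  # 3
print(parts)       # ['Actor', ' "I', ' Robot"']
```

Three items cannot be assigned to the two names profession and bestWork, which is what the ValueError reports.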
You need to split on the first occurrence of the comma only:
profession, bestWork = person.split(",", 1)
Demo:
>>> profession, bestWork = s.split(',', 1)
>>> profession
'Actor'
>>> bestWork
' "I, Robot"'
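Note that bestWork keeps the leading space in the demo above. If the page ever shows a detail line with no comma at all (an assumption, since that case is not confirmed here), split(",", 1) would return a single item and the unpacking would fail again, this time with too few values. A possible sketch that covers both points uses str.partition, which always returns three parts; the helper name split_detail is hypothetical.

```python
def split_detail(detail):
    """Hypothetical helper: split 'Profession, Best Work' on the first comma only."""
    # partition always returns (before, separator, after), even with no comma,
    # in which case separator and after are empty strings.
    profession, _, best_work = detail.partition(",")
    # Strip the whitespace left around each piece.
    return profession.strip(), best_work.strip()


print(split_detail('Actor, "I, Robot"'))  # ('Actor', '"I, Robot"')
print(split_detail('Actress'))            # ('Actress', '')
```

In the scraping loop this could replace the split line, for example profession, bestWork = split_detail(person).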