Question

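I'm trying to scrape a job search website by iterating over all of its pages, and I keep running into a problem when I append a dictionary to a list inside a for loop. When I run the code below in Python 3.4, it pulls all the relevant data from each page into a dictionary (I checked with print()) and appends it to "FullJobDetails", but when the for loop finishes I get a list full of dictionaries that all hold the data from the last page. The number of dictionaries is exactly the same as the number of pages in the "ListofJobs" list. "ListofJobs" is a list of HTML links to each page I want to scrape.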

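I've only just started learning to code, so I know the code below is not in any way, shape, or form the most efficient or best way to do this. Any suggestions would be appreciated. Thanks in advance!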

FullJobDetails = []
browser = webdriver.Chrome()
dictionary = {}

for jobs in ListofJobs:
  browser.get(jobs)
  dictionary["Web Page"] = jobs
  try:
    dictionary["Views"] = browser.find_element_by_class_name('job-viewed-item-count').text
  except NoSuchElementException:
    dictionary["Views"] = 0

  try:
    dictionary['Applicants'] = browser.find_element_by_class_name('job-applied-item-count').text
  except NoSuchElementException:
    dictionary["Applicants"] = 0

  try:
    dictionary["Last Application"] = browser.find_element_by_class_name('last-application-time-digit').text
  except NoSuchElementException:
    dictionary["Last Application"] = "N/A"

  try:
    dictionary["Job Title"] = browser.find_element_by_class_name('title').text
  except NoSuchElementException:
    dictionary["Job Title"] = "N/A"

  try:
    dictionary['Company'] = browser.find_element_by_xpath('/html/body/div[3]/article/section[2]/div/ul/li[4]/span/span').text
  except NoSuchElementException:
    dictionary['Company'] = "Not found"

  try:
    dictionary['Summary'] = browser.find_element_by_class_name('summary').text
  except NoSuchElementException:
    dictionary['Summary'] = "Not found"

  FullJobDetails.append(dictionary)

Answer 1

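The problem is that you create only one dictionary. Dictionaries are mutable objects, so the same dictionary is appended to your list over and over, and on each pass of the for loop you update its contents. In the end, your list holds multiple references to that one dictionary, all showing the information from the last page.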
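The effect can be reproduced without a browser. The sketch below is only an illustration: `pages`, `results`, and `record` are hypothetical stand-ins for "ListofJobs", "FullJobDetails", and "dictionary".

```python
# Hypothetical stand-in data, so no Selenium session is needed.
pages = ["page-1", "page-2", "page-3"]

results = []
record = {}  # created once, outside the loop

for page in pages:
    record["Web Page"] = page  # overwrites the value in the same object
    results.append(record)     # appends another reference, not a copy

print(results)
# [{'Web Page': 'page-3'}, {'Web Page': 'page-3'}, {'Web Page': 'page-3'}]

# Every list entry is the very same object.
print(all(item is record for item in results))
# True
```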

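Simply create a new dictionary object on each pass of the for loop. That new dictionary is stored in the list, and the variable name dictionary can then be rebound to the next new object without affecting the ones already stored.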

for jobs in ListofJobs:
  dictionary = {} 
  browser.get(jobs)
  ...
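As a rough check of the fix, here is the same idea with stand-in data instead of a browser. The names `pages`, `results`, `record`, `working`, and `snapshots` are hypothetical and do not come from the question. The second loop shows an alternative that should also work, assuming every key is overwritten on each pass: keep one working dictionary and append a shallow copy of it.

```python
# Hypothetical stand-in data, so no Selenium session is needed.
pages = ["page-1", "page-2", "page-3"]

# Fix: build a fresh dictionary on every pass.
results = []
for page in pages:
    record = {}                # new dict object each iteration
    record["Web Page"] = page
    results.append(record)

print(results)
# [{'Web Page': 'page-1'}, {'Web Page': 'page-2'}, {'Web Page': 'page-3'}]

# Alternative: reuse one dictionary but store a shallow copy of it.
snapshots = []
working = {}
for page in pages:
    working["Web Page"] = page
    snapshots.append(working.copy())  # copy is independent of later updates

print(snapshots == results)
# True
```

Creating a fresh dictionary is usually the safer choice: with the copy approach, a key that is not overwritten on a later pass would silently carry over the value from an earlier page.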
