Why are there empty cells in my scraped Excel file?

Asked: 2018-03-28 06:21:10

Tags: python excel python-3.x beautifulsoup

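I am trying to scrape developer jobs from indeed.nl into Excel using Python and bs4. Everything works, but when I open the resulting file in Excel, there are extra blank rows between the jobs.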

Can anyone see what I am doing wrong?

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.indeed.nl/jobs?q=developer&l='

# opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

#grabs each job
containers = page_soup.findAll("div",{"class":"row"})

filename = "indeedjobs.csv"

f = open(filename, "w")

headers = "Company; Job; City\n"
f.write(headers)

for container in containers:
    jobtitle = container.a["title"]
    city_container = container.findAll("span",{"class":"location"})
    City_name = city_container[0].text
    company_container = container.findAll("span",{"class":"company"})
    company_name = company_container[0].text

    print("Company: " + company_name)
    print("Job: " + jobtitle)
    print("City: " + City_name)

    f.write(company_name + ";" + jobtitle + ";" + City_name + "\n")
f.close()

1 answer:

Answer 0 (score: 2)

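The text of the <span class="company"> elements starts with a newline and some spaces. That newline ends up in the middle of each line you write, so every job is split across two rows. Remove the surrounding whitespace with .strip(), for example company_container[0].text.strip().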
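The effect can be reproduced without touching the network. In the sketch below, the sample strings are made-up stand-ins for what company_container[0].text and the other fields might return; only the leading newline and spaces matter.

```python
# Hypothetical stand-in for company_container[0].text: the span's text
# begins with a newline and some indentation.
raw_company = "\n    Example BV"
jobtitle = "Python Developer"  # assumed sample value
city = "Amsterdam"             # assumed sample value

# Without strip(): the embedded newline splits the record across two lines.
broken_row = raw_company + ";" + jobtitle + ";" + city + "\n"

# With strip(): leading and trailing whitespace is removed, so the record
# stays on a single line.
fixed_row = raw_company.strip() + ";" + jobtitle + ";" + city + "\n"

print(repr(broken_row))
print(repr(fixed_row))
```

Applying .strip() to the job title and city as well is harmless and guards against the same problem in those fields.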

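You could also consider using the csv module to write a properly formatted CSV file. The module takes care of quoting and escaping special characters, such as a semicolon inside a job title.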
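A minimal sketch of the csv-module approach might look like this. The rows are invented sample data, write_jobs is a hypothetical helper name, and an in-memory buffer is used in place of indeedjobs.csv so the example runs on its own.

```python
import csv
import io

# Hypothetical rows standing in for the scraped (company, job, city) values.
rows = [
    ("Example BV", "Python Developer; remote", "Amsterdam"),
    ("\n    Sample NV", "Java Developer", "Utrecht"),
]


def write_jobs(fileobj, rows):
    """Write a header and one semicolon-delimited record per job."""
    writer = csv.writer(fileobj, delimiter=";")
    writer.writerow(["Company", "Job", "City"])
    for company, job, city in rows:
        # strip() removes the stray newline and spaces around each field.
        writer.writerow([company.strip(), job.strip(), city.strip()])


# newline="" keeps the line endings exactly as the csv writer emits them.
buffer = io.StringIO(newline="")
write_jobs(buffer, rows)
print(buffer.getvalue())
```

The job title containing a semicolon is quoted automatically, so it stays in one column. To write to disk instead, pass a real file opened with open("indeedjobs.csv", "w", newline=""), as the csv documentation recommends; omitting newline="" can itself produce extra blank rows on Windows.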