Formatting a CSV generated by Python

Time: 2019-01-30 10:56:56

Tags: python-3.x

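I am building a web scraper in Python.

I want to remove the blank rows from the generated CSV and add a header row with the titles "Car make", "Car Model", "Price". I also want to remove the [] from all the names in the generated CSV.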

    imports go here...

    source = requests.get(' website link goes here...').text
    soup = bs(source, 'html.parser')

    csv_file = open('pyScraper_1.3_Export', 'w')
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['brand_Names', 'Prices'])
    csv_file.close()

    #gives us the make and model of all cars
    Names = []
    Prices_Cars = []
    for var1 in soup.find_all('h3', class_ = 'brandModelTitle'):
        car_Names = var1.text # var1.span.text
        test_Split = car_Names.split("\n")
        full_Names = test_Split[1:3]
        #make = test_Split[1:2]
        #model = test_Split[2:3]
        Names.append(full_Names)

    #prices
    for Prices in soup.find_all('span', class_ = 'f20 bold fieldPrice'):
        Prices = Prices.span.text
        Prices = re.sub("^\s+|\s+$", "",Prices, flags=re.UNICODE) # removing whitespace before the prices
        Prices_Cars.append(Prices)

    csv_file = open('pyScraper_1.3_Export.csv', 'a')
    csv_writer = csv.writer(csv_file)
    i = 0
    while i < len(Prices_Cars):
        csv_writer.writerow([Names[i], Prices_Cars[i]])
        i = i + 1
    csv_file.close()

Here is a screenshot of the generated CSV:

![][1]


[1]: https://i.stack.imgur.com/m7Xw1.jpg

1 answer:

Answer 0: (score: 0)

To remove the extra blank lines:

    csv_file = open('pyScraper_1.3_Export.csv', 'a', newline='')

("If csvfile is a file object, it should be opened with newline=''", https://docs.python.org/3/library/csv.html#csv.writer)
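As a minimal sketch of what `newline=''` changes (the file name `demo.csv` and the sample rows are made up for illustration): the `csv` writer already ends each row with `\r\n`, so opening the file with `newline=''` keeps Python from translating the `\n` a second time on Windows, which is where the blank rows come from.

```python
import csv
import os
import tempfile

# Hypothetical file name and sample rows, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

with open(path, "w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(["Toyota", "5000"])
    csv_writer.writerow(["Honda", "6500"])

# Read the raw bytes back to inspect the line endings actually written.
with open(path, "rb") as raw_file:
    raw_bytes = raw_file.read()

print(raw_bytes)
```

Each row should end in a single `\r\n`; a doubled `\r\r\n` is what spreadsheet programs show as an empty row.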

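To add the headers: you are actually already writing the headers, but to a file named pyScraper_1.3_Export (note the missing .csv extension), which is probably a typo. Just change the code at line 6 to: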

    csv_file = open('pyScraper_1.3_Export.csv', 'w', newline='')
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(["Car make", "Car Model", "Price"])
    csv_file.close()
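Since the header and the data end up in the same file, one option is to open it once in a `with` block and write both there, which also closes the file automatically. This is only a sketch under that assumption; the temporary directory and the sample row are placeholders, not values from the question.

```python
import csv
import os
import tempfile

# Placeholder location; the real script would write next to itself.
path = os.path.join(tempfile.mkdtemp(), "pyScraper_1.3_Export.csv")

# Placeholder data standing in for the scraped values.
sample_rows = [["Toyota", "Corolla", "5000"]]

with open(path, "w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(["Car make", "Car Model", "Price"])  # header first
    csv_writer.writerows(sample_rows)                        # then the data

# Read the file back to confirm the header is the first row.
with open(path, newline="") as csv_file:
    rows_read = list(csv.reader(csv_file))

print(rows_read)
```

With this layout the later `open(..., 'a')` call is no longer needed, because nothing is appended after the file is closed.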

To get rid of the nested list, unpack Names[i] with the * operator:

    csv_writer.writerow([*Names[i], Prices_Cars[i]])
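To see why the brackets appear and what unpacking changes, here is a small sketch that writes to an in-memory buffer instead of a file. The car names and prices are invented sample data shaped like the question's `Names` and `Prices_Cars` lists, and `zip` is swapped in for the index-based `while` loop as one possible alternative.

```python
import csv
import io

# Made-up sample data shaped like the question's lists.
Names = [["Toyota", "Corolla"], ["Honda", "Civic"]]
Prices_Cars = ["5000", "6500"]

# Passing the inner list as one cell: csv converts it with str(),
# so the brackets and quotes end up inside the cell.
nested = io.StringIO()
csv.writer(nested).writerow([Names[0], Prices_Cars[0]])

# Unpacking the inner list: make and model become separate cells.
unpacked = io.StringIO()
csv_writer = csv.writer(unpacked)
for name_parts, price in zip(Names, Prices_Cars):
    csv_writer.writerow([*name_parts, price])

print(nested.getvalue())
print(unpacked.getvalue())
```

Note that `zip` stops at the shorter of the two lists, so a mismatch between the number of names and prices drops the extra entries silently instead of raising an `IndexError`.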