Whitespace in CSV output - Python

Time: 2016-04-24 10:17:19

Tags: python-2.7 csv

I keep running into problems getting clean CSV output.

Here is the program:

import csv
import requests
from lxml import html

page = requests.get('http://www.mediamarkt.be/mcs/productlist/_108-tot-127-cm-43-tot-50-,98952,501090.html?langId=-17')
tree = html.fromstring(page.content)

outfile = open("./tv_test1.csv", "wb")
writer = csv.writer(outfile)

rows = tree.xpath('//*[@id="category"]/ul[2]/li')
writer.writerow(["Product Name", "Price"])

for row in rows:
    price = row.xpath('div/aside[2]/div[1]/div[1]/div/text()')
    product_ref = row.xpath('div/div/h2/a/text()')
    writer.writerow([product_ref,price])

outfile.close()

Current output:

['\r\n\t\t\t\t\tTV SAMSUNG UE48JU6640UXXN 48" LCD FULL LED Smart Ultra HD Curved\r\n\t\t\t\t'],"['999,-']"

Required output:

TV SAMSUNG UE48JU6640UXXN 48" LCD FULL LED Smart Ultra HD Curve,999,-

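2 answers:

Answer 0 (score: 0)

Found it. Wrapping each XPath expression in normalize-space() makes xpath() return a single trimmed string instead of a list: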

import csv
import requests
from lxml import html

page = requests.get('http://www.mediamarkt.be/mcs/productlist/_108-tot-127-cm-43-tot-50-,98952,501090.html?langId=-17')
tree = html.fromstring(page.content)

outfile = open("./tv_test1.csv", "wb")
writer = csv.writer(outfile)

rows = tree.xpath('//*[@id="category"]/ul[2]/li')
writer.writerow(["Product Name", "Price"])

for row in rows:
    price = row.xpath('normalize-space(div/aside[2]/div[1]/div[1]/div/text())')
    product_ref = row.xpath('normalize-space(div/div/h2/a/text())')
    writer.writerow([product_ref,price])

outfile.close()
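
For intuition, the effect of normalize-space() on a single text node can be approximated in plain Python. The sketch below is not part of the answer above: it assumes Python 3, uses only the standard library, and writes to an in-memory buffer instead of a file. The helper name `normalize_space` is hypothetical, and the sample values are copied from the question's current output.

```python
import csv
import io


def normalize_space(text):
    """Approximate XPath normalize-space(): trim both ends and collapse
    each internal run of whitespace into a single space."""
    return " ".join(text.split())


# Sample values copied from the question's current output.
raw_name = ['\r\n\t\t\t\t\tTV SAMSUNG UE48JU6640UXXN 48" LCD FULL LED Smart Ultra HD Curved\r\n\t\t\t\t']
raw_price = ['999,-']

buffer = io.StringIO()  # in-memory stand-in for the output file
writer = csv.writer(buffer)
# Write plain strings (the first text node of each list), not the lists themselves.
writer.writerow([normalize_space(raw_name[0]), normalize_space(raw_price[0])])
csv_line = buffer.getvalue()
print(csv_line)
```

Note that the csv module will still quote the price, because `999,-` contains the delimiter. Without the quotes, a CSV reader would split it into two columns.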

Answer 1 (score: 0)

You can remove \n, \r, and \t from the data before writing it to the CSV file:

import csv
import requests
from lxml import html

page = requests.get('http://www.mediamarkt.be/mcs/productlist/_108-tot-127-cm-43-tot-50-,98952,501090.html?langId=-17')
tree = html.fromstring(page.content)

outfile = open("./tv_test1.csv", "wb")
writer = csv.writer(outfile)

rows = tree.xpath('//*[@id="category"]/ul[2]/li')
writer.writerow(["Product Name", "Price"])

for row in rows:
    price = row.xpath('div/aside[2]/div[1]/div[1]/div/text()')
    for i in range(len(price)):
        price[i]= price[i].replace("\n","")
        price[i]= price[i].replace("\t","")
        price[i]= price[i].replace("\r","")

    product_ref = row.xpath('div/div/h2/a/text()')
    for i in range(len(product_ref)):
        product_ref[i]= product_ref[i].replace("\n","")
        product_ref[i]= product_ref[i].replace("\t","")
        product_ref[i]= product_ref[i].replace("\r","")
    if len(product_ref) and len(price):
        # Write the cleaned strings themselves; passing the lists would put ['...'] in the CSV
        writer.writerow([product_ref[0], price[0]])

outfile.close()

You will get:

[Image: screenshot of the resulting output]

Note that I also check the lengths of price and product_ref before storing them in the file.
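
The same cleanup and length check can be written more compactly. The following is only a sketch under a few assumptions: it targets Python 3 rather than the Python 2.7 of the question, the helper `first_clean` and the `scraped` sample rows are hypothetical, and an in-memory buffer stands in for the output file so that the example runs without network access.

```python
import csv
import io

# Maps \r, \n and \t to None so that str.translate() deletes them.
STRIP_CONTROL = {ord(ch): None for ch in "\r\n\t"}


def first_clean(values):
    """Return the first string in `values` with carriage returns, newlines
    and tabs removed, or None if the list is empty."""
    if not values:
        return None
    return values[0].translate(STRIP_CONTROL)


# Hypothetical scraped rows, shaped like the lists that xpath() returns.
scraped = [
    (['\r\n\t\tTV A\r\n\t'], ['999,-']),
    ([], ['499,-']),  # no product name found, so this row is skipped
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Product Name", "Price"])
for product_ref, price in scraped:
    name, cost = first_clean(product_ref), first_clean(price)
    if name is not None and cost is not None:
        writer.writerow([name, cost])
output = buffer.getvalue()
print(output)
```

When writing to a real file in Python 3, open it in text mode with `open(path, "w", newline="")` instead of `"wb"`.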