Unable to write tabular content to an Excel file using pandas ExcelWriter

Date: 2019-06-29 22:24:30

Tags: python python-3.x pandas web-scraping beautifulsoup

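I've written a Python script that fetches some tabular content from a web page and writes it to an Excel file using pandas ExcelWriter. The table data comes through correctly, but I can't write it to the Excel file. I can write the same thing using openpyxl, but I get stuck when it comes to pandas ExcelWriter.

I've tried: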

import requests
import pandas as pd
from bs4 import BeautifulSoup
from pandas import ExcelWriter

link = "https://en.wikipedia.org/wiki/Comparison_of_Intel_processors"
result = []

res = requests.get(link)
soup = BeautifulSoup(res.text,"lxml")
for items in soup.select_one("table.wikitable").select("tr"):
    data = [item.get_text(strip=True) for item in items.select("th,td")]
    print(data)
    result+=data

df = pd.DataFrame(result)
writer = ExcelWriter('tabular_content.xlsx')
df.to_excel(writer,'Sheet1',index=False)
writer.save()

To avoid any confusion between what I get and what I want, here are two examples.

My current approach writes the data in a single column, like this:

Processor
SeriesNomenclature
CodeName
Production Date
Supported Features (Instruction Set)
Clock Rate
Socket
Fabri-cation

However, I'd like it written like this:

Processor   SeriesNomenclature  CodeName    Production Date Supported Features (Instruction Set)
4004            Nov. 15,1971    
8008    N/A N/A April 1972  N/A
8080    N/A N/A April 1974  N/A
8085    N/A N/A March 1976  N/A
8086    N/A N/A June 8, 1978    N/A
8088    N/A N/A June 1979   N/A
80286   N/A N/A Feb. 1982   N/A
i80386  DX, SX, SL  N/A 1985 - 1990 N/A
i80486  DX, SX, DX2, DX4, SL    N/A 1989 - 1992 N/A

P.S. Using ExcelWriter is a must.

1 Answer:

Answer 0: (score: 0)

There doesn't seem to be anything wrong with ExcelWriter, and in this case you don't even need BeautifulSoup. Just read the data this way:

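    import pandas as pd

    # read_html returns a list of DataFrames, one per <table> found on the page
    tables = pd.read_html("https://en.wikipedia.org/wiki/Comparison_of_Intel_processors")

    # the with-block saves the file on exit; writer.save() was removed in pandas 2.0
    with pd.ExcelWriter('tabular_content.xlsx') as writer:
        tables[0].to_excel(writer, sheet_name='Sheet1', index=False)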

And, at least on my system, it creates the Excel file as expected.
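
If you'd rather keep the BeautifulSoup loop, the single-column output most likely comes from `result+=data`: `+=` extends the list cell by cell, so the DataFrame receives one flat list. Using `result.append(data)` keeps each table row as its own list. Below is a minimal sketch of the difference. It assumes a few hard-coded sample rows (taken from the expected output in the question) in place of the scraped ones, that no row is wider than the header row, and `write_sheet` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Sample rows copied from the question's expected output; in the real script
# each list would come from one <tr> of the scraped table (assumption).
scraped_rows = [
    ["Processor", "SeriesNomenclature", "CodeName", "Production Date"],
    ["8008", "N/A", "N/A", "April 1972"],
    ["8080", "N/A", "N/A", "April 1974"],
]

flat = []    # what `result += data` builds: one long list of cells
nested = []  # what `result.append(data)` builds: one list per table row
for data in scraped_rows:
    flat += data
    nested.append(data)

# A flat list becomes a single column; a list of lists becomes rows and columns.
single_column = pd.DataFrame(flat)
df = pd.DataFrame(nested[1:], columns=nested[0])  # first row is the header


def write_sheet(frame, path="tabular_content.xlsx"):
    """Hypothetical helper: write the frame to Sheet1 via ExcelWriter.

    The with-block saves and closes the file on exit.
    """
    with pd.ExcelWriter(path) as writer:
        frame.to_excel(writer, sheet_name="Sheet1", index=False)


print(single_column.shape)  # (12, 1)
print(df.shape)             # (2, 4)
```

Calling `write_sheet(df)` should then produce the row-and-column layout; it needs an Excel engine such as openpyxl installed.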