How do I export Python data to Excel using xlwt?

Asked: 2018-08-13 07:11:33

Tags: python web-scraping beautifulsoup request xlwt

Here is my code:

from bs4 import BeautifulSoup
import requests

url = 'http://www.baseballpress.com/lineups'

soup = BeautifulSoup(requests.get(url).text, 'html.parser')

for names in soup.find_all(class_="players"):
    print(names.text) 

I want to export the scraped data to Excel using xlwt. I used the code below to check whether I could create an Excel sheet with Python:

import xlwt  

wb = xlwt.Workbook()  
ws = wb.add_sheet("Batters")  
ws.write(0,0,"coding isn't easy")  
wb.save("myfirst_xlwt")

The code above works. I now want to apply it to my original scrape. How do I combine these two pieces of code?

I'm new to this, so any help would be appreciated. Thanks for your time! =)

2 Answers:

Answer 0 (score: 2):

I tried running your code, but it couldn't find anything with the class you specified; it returned [].

Regarding xlwt: basically, it just writes one cell at a time (given row and column arguments) with whatever string you specify.

wb = xlwt.Workbook()
ws = wb.add_sheet('sheet_name')
ws.write(0, 0, "content")  # writes the cell at row 0, column 0 in the sheet named 'sheet_name'
wb.save("example.xls")

However, I think pandas is better for this. xlwt can get messy if you lose track of the row and column numbers. If you can share some non-empty results, I can write a simple script for you to export to Excel with pandas.

Here is the code, using pandas as an example:

from bs4 import BeautifulSoup
import requests

url = 'http://www.baseballpress.com/lineups'

soup = BeautifulSoup(requests.get(url).text, 'html.parser')

all_games = []

for g in soup.find_all(class_="game"):
    players = g.find_all('a', class_='player-link')
    game = {
        'time': g.find(class_='game-time').text,
        'weather': g.find(target='forecast').text.strip(),
        'players': [p.text for p in players],
    }
    all_games.append(game)

print(all_games)  # prints a list of dicts containing the game information

import pandas as pd
df = pd.DataFrame.from_dict(all_games)  # construct a DataFrame from the list of dicts
writer = pd.ExcelWriter('baseball.xlsx')  # initialize the pandas Excel writer with the file name 'baseball.xlsx'
df.to_excel(writer, 'baseball_sheet')  # write to a sheet called 'baseball_sheet'; the layout follows the DataFrame
writer.save()  # save the Excel file
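
One caveat: the players column holds Python lists, which land in Excel as their string form. If you'd rather have one column per player, a minimal, untested sketch (assuming all_games is built as above) could look like this:

import pandas as pd

df = pd.DataFrame.from_dict(all_games)

# Expand the 'players' lists into separate columns: player_0, player_1, ...
# Shorter lineups are padded with NaN automatically.
players_df = pd.DataFrame(df['players'].tolist()).add_prefix('player_')
df = df.drop(columns='players').join(players_df)

writer = pd.ExcelWriter('baseball.xlsx')
df.to_excel(writer, 'baseball_sheet')
writer.save()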

Answer 1 (score: 1):

The simplest way to merge the snippets is to use a ws.write call wherever you have a print statement. You can use enumerate to keep track of the row index:

from bs4 import BeautifulSoup
import requests
import xlwt  

wb = xlwt.Workbook()  
ws = wb.add_sheet("Batters")  

url = 'http://www.baseballpress.com/lineups'

soup = BeautifulSoup(requests.get(url).text, 'html.parser')

for row, name in enumerate(soup.find_all(class_="players")):
    ws.write(row, 0, name.text)
wb.save("myfirst_xlwt")