Here is my scraper:
from bs4 import BeautifulSoup
import requests
url = 'http://www.baseballpress.com/lineups'
soup = BeautifulSoup(requests.get(url).text, 'html.parser')
for names in soup.find_all(class_="players"):
    print(names.text)
I want to export the scraped data to Excel using xlwt. I used the code below to check whether I could create an Excel worksheet with Python:
import xlwt
wb = xlwt.Workbook()
ws = wb.add_sheet("Batters")
ws.write(0,0,"coding isn't easy")
wb.save("myfirst_xlwt")
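The code above works. Now I want to apply it to my original scraper. How do I combine the two pieces of code?
I'm new to this, so any help would be greatly appreciated. Thanks for your time! =)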
Answer 0 (score: 2)
I tried to run your code, but it could not find any elements with the class you search for (players). find_all returns [].
As for xlwt, it basically just writes the string you give it into a single cell, identified by the row and column arguments.
import xlwt
wb = xlwt.Workbook()
ws = wb.add_sheet('sheet_name')
ws.write(0, 0, "content")  # Write "content" to row 0, column 0 of the sheet named "sheet_name"
wb.save("example.xls")
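To make the row and column arguments concrete, here is a minimal sketch of how a small table maps onto individual cell writes. The iter_cells helper and the sample data are hypothetical and only serve as illustration; the sketch itself does not call xlwt.

```python
def iter_cells(rows):
    """Yield (row_index, col_index, value) for every value in a list of rows."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            yield r, c, value


# Hypothetical sample data: a header row followed by two data rows.
table = [
    ["Batter", "Team"],
    ["Player A", "Team X"],
    ["Player B", "Team Y"],
]

cells = list(iter_cells(table))
for r, c, value in cells:
    print(r, c, value)
```

Each triple lines up with one ws.write(r, c, value) call, so with a real worksheet the loop body would be that call instead of print.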
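However, I think pandas is a better fit for this. xlwt can get messy once you lose track of row and column numbers. If you can provide some non-empty results, I can write a simple script that exports them to Excel with pandas.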
Here is an example that uses pandas:
from bs4 import BeautifulSoup
import pandas as pd
import requests

url = 'http://www.baseballpress.com/lineups'
soup = BeautifulSoup(requests.get(url).text, 'html.parser')

all_games = []
for g in soup.find_all(class_="game"):
    game = {
        'time': g.find(class_='game-time').text,
        'weather': g.find(target='forecast').text.strip(),
        'players': [a.text for a in g.find_all('a', class_='player-link')],
    }
    all_games.append(game)

print(all_games)  # Prints a list of dicts, one per game

df = pd.DataFrame(all_games)  # Build a DataFrame from the list of dicts
# Write to a sheet called 'baseball_sheet' in 'baseball.xlsx'.
# The file is saved when the with-block exits.
with pd.ExcelWriter('baseball.xlsx') as writer:
    df.to_excel(writer, sheet_name='baseball_sheet')
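Since each game's 'players' value is a list, the sheet may be easier to read with one row per player. Below is a rough sketch of that reshaping step. The flatten_games helper and the sample values are made up for illustration; only the dictionary keys come from the code above.

```python
# Hypothetical sample in the same shape as all_games above.
all_games = [
    {"time": "7:05 pm ET", "weather": "Clear", "players": ["Player A", "Player B"]},
    {"time": "8:10 pm ET", "weather": "Rain", "players": ["Player C"]},
]


def flatten_games(games):
    """Return one dict per (game, player) pair instead of one dict per game."""
    return [
        {"time": g["time"], "weather": g["weather"], "player": p}
        for g in games
        for p in g["players"]
    ]


rows = flatten_games(all_games)
for row in rows:
    print(row)
```

If this layout suits you, pass rows to pd.DataFrame in place of all_games and export as before.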
Answer 1 (score: 1)
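The simplest way to combine the two snippets is to call ws.write wherever you currently have a print statement. You can use enumerate to keep track of the row index: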
from bs4 import BeautifulSoup
import requests
import xlwt
wb = xlwt.Workbook()
ws = wb.add_sheet("Batters")
url = 'http://www.baseballpress.com/lineups'
soup = BeautifulSoup(requests.get(url).text, 'html.parser')
for row, name in enumerate(soup.find_all(class_="players")):
    ws.write(row, 0, name.text)
wb.save("myfirst_xlwt")
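If you also want a header in the first row, enumerate accepts a start argument, so the data can begin at row 1. This is only a sketch: write_column is a hypothetical helper, and RecordingSheet is a stand-in that records the calls so the example runs without xlwt installed.

```python
def write_column(sheet, header, values, col=0):
    """Write a header in row 0 and the values below it; return rows written."""
    sheet.write(0, col, header)
    written = 1
    # start=1 leaves row 0 free for the header.
    for row, value in enumerate(values, start=1):
        sheet.write(row, col, value)
        written += 1
    return written


class RecordingSheet:
    """Hypothetical stand-in for a worksheet; stores each write in a dict."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


demo = RecordingSheet()
total = write_column(demo, "Batter", ["Player A", "Player B"])
print(total, demo.cells)
```

With xlwt, you would pass the ws object from wb.add_sheet("Batters") instead of the stand-in, since it exposes the same write(row, col, value) call used above.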