My task is to grab the first and the last record from a table and save them to Excel. The expected result is:
'07-28 03:17', '3.90', '1.97', '2.75'
'07-29 18:41', '3.90', '1.97', '2.75'
The code is as follows:
import pandas as pd
import datetime
import requests
from bs4 import BeautifulSoup
url = 'https://g10oal.com/match/116539/odds'
r = requests.get(url)
data = BeautifulSoup(r.text, 'lxml')
fha = data.find_all('table')[1]  # first-half home/away/draw odds
file2 = open("c:/logs/link/G10oal-fha.txt", "a+")
rows = fha.find_all('tr')
for row in rows:
    # collect the stripped text of every <td> cell in this row
    cols = row.find_all('td')
    cols = [x.text.strip() for x in cols]
    print(cols)
    file2.write(str(cols))
file2.close()
================================================
The problem was solved with the following code:
# r, link, mth and day are defined earlier in the script (not shown here)
data = BeautifulSoup(r.content, 'html.parser')
fha = data.find_all('table')[1]  # first-half home/away/draw odds
file3 = open("c:/logs/history/HKJC-FHA-2021-" + mth + day + ".txt", "a+")
rows = fha.find_all('tr')
early_row = rows[-1].text.split()  # bottom row of the table: the earliest record
last_row = rows[1].text.split()    # row at index 1: the latest record (index 0 is presumably the header)
result = early_row, last_row
print(link)
print(result)
file3.write(str(result) + "\n")
file3.close()
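The snippet above uses `mth` and `day` without showing where they come from. A minimal sketch of one way to build that file name, assuming both are zero-padded strings taken from a date (the function name `history_path` and the `base` parameter are hypothetical, and the year is read from the date instead of being hard-coded as 2021):

```python
import datetime

def history_path(date, base="c:/logs/history"):
    """Build the log file path for a given date (hypothetical helper)."""
    mth = date.strftime("%m")  # zero-padded month, e.g. "07"
    day = date.strftime("%d")  # zero-padded day, e.g. "29"
    return f"{base}/HKJC-FHA-{date.year}-{mth}{day}.txt"

# Example with a fixed date so the result is reproducible
print(history_path(datetime.date(2021, 7, 29)))
```

Passing `datetime.date.today()` instead would give the path for the current day.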
Answer 0 (score: 0)
Maybe you could put your data into a list and then take only the first and the last item?
data = BeautifulSoup(r.text, 'lxml')
result = [res for res in data.select('body > div.container.match-odds > div.match-info-header > div > div:nth-child(2) > div > p')]
print("First: %s\nLast: %s" % (result[0], result[-1]))
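The list idea can also be applied to the table rows themselves. The sketch below works on rows already parsed into lists of cell text, so it needs no network access. It assumes the table lists the newest record first, as the expected output in the question suggests. The function name `earliest_and_latest` is hypothetical and the middle sample row is made up. It writes CSV with the standard library, which Excel opens directly; pandas `DataFrame.to_excel` would be an alternative if an `.xlsx` file is required.

```python
import csv
import io

def earliest_and_latest(rows):
    """Return (earliest, latest) from rows that are listed newest first.

    Empty rows (for example a header row without <td> cells) are skipped.
    """
    data = [r for r in rows if r]
    if not data:
        raise ValueError("no data rows found")
    return data[-1], data[0]

# Rows in the shape produced by [td.text.strip() for td in row.find_all('td')]
sample_rows = [
    [],                                        # header row: no <td> cells
    ['07-29 18:41', '3.90', '1.97', '2.75'],   # from the question
    ['07-29 02:05', '3.85', '1.98', '2.75'],   # invented filler row
    ['07-28 03:17', '3.90', '1.97', '2.75'],   # from the question
]

earliest, latest = earliest_and_latest(sample_rows)

# Write both records as CSV; an in-memory buffer stands in for a real file here
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(earliest)
writer.writerow(latest)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replace the buffer with `open(path, "w", newline="")`.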