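I tried string.replace("'", "H"), but this returns an error:
AttributeError: 'list' object has no attribute 'replace'
I could also use re.sub, but that produces a similar error.
I may have found a way to solve the problem: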
25 July 2015
Scottish Football
East Stirling 2 - Stenhousemuir 3
[u" Donaldson 30' ", u" McKenna 77' "]
[u" Stirling 35', 45' ", u" McMenamin 59' "]
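The output above is what I currently get. How can I remove the surrounding [u" "] and then replace ' with H on the first line and with A on the second line?
I am trying to produce the bottom two lines as shown here: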
25 July 2015
Scottish Football
East Stirling 2 - Stenhousemuir 3
30H, 77H,
35A, 45A, 59A,
and then remove all the names from the text.
import requests
from bs4 import BeautifulSoup
import csv
import re
from collections import OrderedDict
def parse_page(data):
    subsoup = BeautifulSoup(data)
    rs = requests.get("http://www.bbc.co.uk/sport/0/football/33578498")
    ssubsoup = BeautifulSoup(rs.content)
    matchoverview = subsoup.find('div', attrs={'id':'match-overview'})
    print '--------------'
    date = ssubsoup.find('div', attrs={'id':'article-sidebar'}).findNext('span').text
    league = ssubsoup.find('a', attrs={'class':'secondary-nav__link'}).findNext('span').findNext('span').text
    #HomeTeam info printing
    homeTeam = matchoverview.find('div', attrs={'class':'team-match-details'}).findNext('span').findNext('a').text
    homeScore = matchoverview.find('div', attrs={'class':'team-match-details'}).findNext('span').findNext('span').text
    homeGoalScorers = []
    for goals in matchoverview.find('div', attrs={'class':'team-match-details'}).findNext('p').find_all('span'):
        homeGoalScorers.append(goals.text.replace(u'\u2032', "'"))
    homeGoals = homeGoalScorers
    #AwayTeam info printing
    awayTeam = matchoverview.find('div', attrs={'id': 'away-team'}).find('div', attrs={'class':'team-match-details'}).findNext('span').findNext('a').text
    awayScore = matchoverview.find('div', attrs={'id': 'away-team'}).find('div', attrs={'class':'team-match-details'}).findNext('span').findNext('span').text
    awayGoalScorers = []
    for goals in matchoverview.find('div', attrs={'id': 'away-team'}).find('div', attrs={'class':'team-match-details'}).findNext('p').find_all('span'):
        awayGoalScorers.append(goals.text.replace(u'\u2032', "'"))
    awayGoals = awayGoalScorers
    #Printouts
    print date
    print league
    print '{0} {1} - {2} {3}'.format(homeTeam, homeScore, awayTeam, awayScore)
    print homeGoals
    print awayGoals
    if len(homeTeam) > 1:
        with open('score.txt', 'a') as f:
            writer = csv.writer(f)
            writer.writerow([league,date,homeTeam,awayTeam])
def all_league_results():
    r = requests.get("http://www.bbc.co.uk/sport/football/league-one/results")
    soup = BeautifulSoup(r.content)
    # Save Teams
    for link in soup.find_all("a", attrs={'class': 'report'}):
        fullLink = 'http://www.bbc.com' + link['href']
        subr = requests.get(fullLink)
        parse_page(subr.text)
def specific_game_results(url):
    subr = requests.get(url)
    parse_page(subr.text)
#get specific games results
specific_game_results('http://www.bbc.co.uk/sport/0/football/33578498')
Answer 0 (score: 0)
I believe you can change the code here:
for goals in matchoverview.find('div', attrs={'class':'team-match-details'}).findNext('p').find_all('span'):
    homeGoalScorers.append(goals.text.replace(u'\u2032', "'") +'H')
homeGoals = ",".join(homeGoalScorers)
and remove homeGoals = "H".join(homeGoalScorers)
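For reference, the AttributeError in the question comes from calling replace on the list itself. replace is a string method, so it has to be applied to each element, or the list has to be joined into a string first. Appending 'H' to each entry would also keep the player names, so a regular expression may be closer to the requested output. The following is only a minimal sketch under stated assumptions: the helper name format_goal_minutes is hypothetical, the two sample lists are copied from the printed output in the question, and the pattern assumes every goal minute is a run of digits (optionally with stoppage time such as 90+2) followed by an apostrophe.

```python
import re

def format_goal_minutes(scorers, suffix):
    """Hypothetical helper: keep only the goal minutes and tag each one.

    scorers is a list of strings such as u" Stirling 35', 45' ";
    suffix is "H" for the home side or "A" for the away side.
    """
    minutes = []
    for entry in scorers:
        # Capture each minute that is followed by an apostrophe.
        # The optional "+n" part is an assumption about how stoppage
        # time might be written; player names never match and are dropped.
        minutes.extend(re.findall(r"(\d+(?:\+\d+)?)'", entry))
    # Joining produces a plain string, so no [u"..."] list wrapper is printed.
    return " ".join(minute + suffix + "," for minute in minutes)

# Sample data taken from the output shown in the question.
home = [u" Donaldson 30' ", u" McKenna 77' "]
away = [u" Stirling 35', 45' ", u" McMenamin 59' "]

print(format_goal_minutes(home, "H"))
print(format_goal_minutes(away, "A"))
```

In parse_page, this would presumably replace the two final print statements, for example print format_goal_minutes(homeGoals, "H") in the Python 2 style used in the question.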