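I'm scraping a table for a specific set of 'td' tags and their inner text. To filter the scrape, I located a specific 'img' tag and tried to use previousSibling to get to the 'td' I want. I have tried previousSibling, previous_sibling, and previous, and I keep getting the error:
'ResultSet' object has no attribute 'previousSibling'
Any help would be greatly appreciated.
Here is my code so far.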
from urllib2 import urlopen
import requests
from bs4 import BeautifulSoup
base_url = 'http://www.myfxbook.com/forex-economic-calendar'
response = urlopen(base_url)
html = response
soup = BeautifulSoup(html.read().decode('utf-8'), "lxml")
table = soup.find('table', attrs={'class': 'table center td30'})
is_row = table.findAll('img', attrs={'class': 'sprite sprite-common sprite-high-impact'}).previousSibling('td').text
print is_row
Answer 0 (score: 1)
The image you are searching for has no siblings. What you want (I think) is the previous sibling of the image's PARENT.
Example:
from bs4 import BeautifulSoup
import requests
base_url = 'http://www.myfxbook.com/forex-economic-calendar'
response = requests.get(base_url)
soup = BeautifulSoup(response.content.decode('utf-8'), "html.parser")
table = soup.find('table', attrs={'class': 'table center td30'})
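is_row = table.findAll('img', attrs={'class': 'sprite sprite-common sprite-high-impact'})
# findAll returns a list-like ResultSet, so handle each <img> individually
for row in is_row:
    # step up to the image's parent, then back to the nearest preceding <td>
    print(row.parent.find_previous_sibling("td").get_text(strip=True))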
Which outputs:
Fed's Yellen Speech
FOMC Member Kashkari Speech
BOE's Governor Carney speech
Claimant Count Change
BOC Rate Statement
BoC Interest Rate Decision
Bank of Canada Monetary Policy Report
BoC Press Conference
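The error in the question comes from calling previousSibling on the result of findAll, which is a list-like ResultSet rather than a single tag. Below is a minimal offline sketch of both points (the ResultSet and the parent's previous sibling). The HTML string is a hypothetical stand-in loosely modelled on the calendar table, not the real page markup, and the row "Some Minor Release" and the class name sprite-low-impact are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an assumption for illustration, the real page may differ.
html = """
<table class="table center td30">
  <tr>
    <td>Fed's Yellen Speech</td>
    <td><img class="sprite sprite-common sprite-high-impact"></td>
  </tr>
  <tr>
    <td>Some Minor Release</td>
    <td><img class="sprite sprite-common sprite-low-impact"></td>
  </tr>
  <tr>
    <td>BOC Rate Statement</td>
    <td><img class="sprite sprite-common sprite-high-impact"></td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", attrs={"class": "table center td30"})

# find_all (alias: findAll) returns a ResultSet, i.e. a list of tags.
# Tag navigation attributes such as previousSibling exist on each tag,
# not on the list, which is what triggers the AttributeError.
images = table.find_all(
    "img", attrs={"class": "sprite sprite-common sprite-high-impact"}
)

# For every matching <img>, go up to its parent cell and then back to
# the nearest preceding <td>, which holds the event name in this sketch.
events = [
    img.parent.find_previous_sibling("td").get_text(strip=True)
    for img in images
]

for event in events:
    print(event)
```

If only the first match is needed, find returns a single tag, so the same parent.find_previous_sibling("td") chain can be applied to it directly without a loop.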