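I am scraping this website (https://www.ivolatility.com/options/RVX/) with the Python requests module. The output of selecting the first table with BeautifulSoup is shown below. Within that first table, I am trying to pull one specific value (19.17) out of the soup built from the requests response.
I would like to do this with BeautifulSoup, but I do not know how to select the particular cell that holds the value.
Do you have any suggestions?
Output from requests: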
<table border="0" bordercolor="red" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td colspan="3"><script language="JavaScript">
function submitCalcForm(event) {
event.preventDefault();
var form = document.getElementById('basicOptionsForm');
var action = form.action;
var regions = ['', 'USA', 'Europe', 'Asia', 'Canada'];
var regionsOptions = form[1];
var selectedRegion = regionsOptions.options[regionsOptions.selectedIndex].value;
var symbol = form[0].value.trim();
var location = (window.location.href.indexOf('.j')>-1)
? (form.action + '?' + form[0].name + '=' + form[0].value + '&' + form[1].name + '=' + selectedRegion)
: ('/options/'+ ((symbol == '') ? '-' : symbol ) +'/'+regions[selectedRegion]);
window.location.href= location;
}
function goToLookup() {
window.location.href= "/options/-/";
}
</script>
<form action="/options.j" id="basicOptionsForm" method="get" onsubmit="submitCalcForm(event);">
<table bgcolor="#ffffff" border="0" cellpadding="0" cellspacing="0">
<tr>
<td>
<table bgcolor="#999999" border="0" cellpadding="0" cellspacing="1">
<tr>
<td bgcolor="#567abb">
<table border="0" cellpadding="1" cellspacing="0" class="table-action">
<tr>
<td><span class="s1w" style="color: #fff;"> Symbol: </span></td><td><input class="s2" name="ticker" size="5" type="text" value="RVX"/></td><td><select class="s2" name="R"><option selected="" value="0">
ALL
</option><option value="1">
USA
</option><option value="2">
Europe
</option><option value="4">
Canada
</option></select></td><td><span class="s2"> </span></td><td><button style="background: #0C6EF8; font-weight: bold; border: 1px solid black;" type="submit">GO!</button></td><td><span class="s2"> </span></td><td><button onclick="goToLookup();" style="background: #0C6EF8; font-weight: bold; border: 1px solid black; color: white; white-space: nowrap;" type="button">
Symbol Lookup</button></td><td><span class="s2"> </span></td>
</tr>
</table>
</td>
</tr>
</table>
</td><td><img border="0" height="1" src="/design/images/0.gif" width="5"/></td><td nowrap=""><b><span class="s4">Russell 2000 Volatility Index</span></b></td><td width="100%"> </td>
</tr>
</table>
</form>
</td>
</tr>
<tr>
<td colspan="3"><img alt="." border="0" height="10" src="/design/images/0.gif" width="1"/></td>
</tr>
<tr valign="top">
<td width="100%"><script type="text/javascript">
<!--
function wr(s) {
document.write(s);
}
var d = new Array(10);
d[20]='N/A';d[25]='-94.06%';d[30]='32.03%';d[35]='34.74';d[56]='N/A';d[61]='N/A';d[66]='10-Apr';d[71]='84.49%';d[97]='N/A';d[102]='03-Oct';d[107]='29-Mar';d[112]='1.43';d[133]='N/A';d[138]='N/A';d[143]='148.97%';d[148]='98.46%';d[174]='N/A';d[179]='-46.88%';d[184]='198.21%';d[189]='0.27';d[210]='N/A';d[215]='N/A';d[220]='25-May';d[225]='110.30%';d[251]='N/A';d[256]='-68.76%';d[261]='75.38%';d[266]='0';d[287]='N/A';d[292]='N/A';d[297]='39.85%';d[302]='120.02%';d[328]='N/A';d[333]='-67.09%';d[338]='69.94%';d[343]='19.17';d[364]='N/A';d[369]='N/A';d[374]='06-Apr';d[379]='06/14/2018';d[405]='N/A';d[410]='-82.49%';d[415]='74.41%';d[441]='N/A';d[446]='N/A';d[451]='164.16%';d[456]='12.93';d[482]='N/A';d[487]='24-May';d[492]='77.70%';d[518]='N/A';d[523]='03-May';d[528]='21-May';d[533]='12/24/2018';d[559]='N/A';d[564]='59.42%';d[569]='84.78%';
wr('<table class="table-data" cellpadding=1 cellspacing=1 border=0 width=100%>');
wr('<tr bgcolor="#cccccc" align=right height=20>');
wr('<td align="center"><font class=s1>Price</font></td>');
wr('<td align="center"><font class=s1>Change (%)</font><img src="/design/images/0.gif" width=4 height=1 border=0/></td>');
wr('<td align="center"><font class=s1>52 wk High</font><img src="/design/images/0.gif" width=4 height=1 border=0/></td>');
wr('<td align="center"><font class=s1>52 wk Low</font><img src="/design/images/0.gif" width=4 height=1 border=0/></td>');
wr('<td align="center"><font class=s1>Stock volume</font>');
wr('<a href="javascript:openHelp(14)" alt="Open Help">');
wr('<img src="/design/images/ico/q_zn.gif" width=8 height=10 border=0 alt="Open Help"/>');
wr('</a><img src="/design/images/0.gif" width=4 height=1 border=0/></td>');
wr('</tr>');
wr('<tr bgcolor="#FFFFFF" align=right height=20>');
wr('<td align="center"><font class=s1>');
wr(d[343]);
wr('</font></td>');
wr('<td align="center"><font class=s1><nobr> ');
wr('<img src="/design/images/ico/up.gif" alt="+" border=0 align="absmiddle" width=7 height=9/> +');
wr(d[189]);
wr(' (+');
wr(d[112]);
wr('%)</nobr></font></td>');
wr('<td align="center"><font class=s1><nobr> ');
wr(d[35]);
wr(' ');
wr(d[533]);
wr('</nobr></font></td><td align="center"><font class=s1><nobr> ');
wr(d[456]);
wr(' ');
wr(d[379]);
wr('</nobr></font></td>');
wr('<td align="center"><font size=-2 class=s1>');
wr(d[266]);
wr('</font></td>');
wr('</tr></table>');
//-->
</script><img border="0" height="10" src="/design/images/0.gif" width="1"/><table border="0" cellpadding="0" cellspacing="0" class="table-data" width="100%">
<tr align="center" bgcolor="
#cccccc
" height="20">
<td align="center" colspan="2"><font class="s2">Current</font></td><td><font class="s2">1 WK AGO</font></td><td><font class="s2">1 MO AGO</font></td><td><font class="s2">52 wk Hi/Date</font></td><td><font class="s2">52 wk Low/Date</font></td>
</tr>
<tr>
<td align="center" bgcolor="
#FFFFFF
" colspan="5" height="20"><font class="s2" color=""> HISTORICAL VOLATILITY <a alt="Open Help" href="javascript:openHelp(4)"><img alt="Open Help" border="0" height="10" src="/design/images/ico/q_zn.gif" width="8"/></a></font></td>
</tr>
<tr align="center" bgcolor="#ffffff">
<td align="right"><font class="s2">10 days</font></td><td><font class="s2">120.02%</font></td><td><font class="s2">84.49%</font></td><td><font class="s2">74.41%</font></td><td><font class="s2">198.21% - 29-Mar</font></td><td><font class="s2">32.03% - 21-May</font></td>
</tr>
<tr align="center" bgcolor="#eeeeee">
<td align="right"><font class="s2">20 days</font></td><td><font class="s2">110.30%</font></td><td><font class="s2">84.78%</font></td><td><font class="s2">69.94%</font></td><td><font class="s2">164.16% - 06-Apr</font></td><td><font class="s2">39.85% - 25-May</font></td>
</tr>
<tr align="center" bgcolor="#ffffff">
<td align="right"><font class="s2">30 days</font></td><td><font class="s2">98.46%</font></td><td><font class="s2">77.70%</font></td><td><font class="s2">75.38%</font></td><td><font class="s2">148.97% - 10-Apr</font></td><td><font class="s2">59.42% - 24-May</font></td>
</tr>
<tr>
<td align="center" bgcolor="
#FFFFFF
" colspan="5" height="20"><font class="s2" color=""> IMPLIED VOLATILITY <a href="javascript:openHelp(12)"><img alt="Open Help" border="0" height="10" src="/design/images/ico/q_zn.gif" width="8"/></a></font></td>
</tr>
<tr align="center" bgcolor="#ffffff">
<td align="right"><font class="s2">IV Index call <a href="javascript:openHelp(9)"><img alt="Open Help" border="0" height="10" src="/design/images/ico/q_zn.gif" width="8"/></a></font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A - N/A</font></td><td><font class="s2">N/A - N/A</font></td>
</tr>
<tr align="center" bgcolor="#eeeeee">
<td align="right"><font class="s2">IV Index put <a href="javascript:openHelp(10)"><img alt="Open Help" border="0" height="10" src="/design/images/ico/q_zn.gif" width="8"/></a></font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A - N/A</font></td><td><font class="s2">N/A - N/A</font></td>
</tr>
<tr align="center" bgcolor="#ffffff">
<td align="right"><font class="s2">IV Index mean <a href="javascript:openHelp('ivxmean')"><img alt="Open Help" border="0" height="10" src="/design/images/ico/q_zn.gif" width="8"/></a></font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A</font></td><td><font class="s2">N/A - N/A</font></td><td><font class="s2">N/A - N/A</font></td>
</tr>
<tr>
<td align="center" bgcolor="
#FFFFFF
" colspan="5" height="20"><font class="s2" color="">HISTORICAL 30-DAYS CORRELATION AGAINST S&P 500 Index (SPX)<a href="javascript:openHelp(30)"><img alt="Open Help" border="0" height="10" src="/design/images/ico/q_zn.gif" width="8"/></a></font></td>
</tr>
<tr align="center" bgcolor="#ffffff">
<td align="right"><font class="s2">30 days</font></td><td><font class="s2">-82.49%</font></td><td><font class="s2">-67.09%</font></td><td><font class="s2">-68.76%</font></td><td><font class="s2">-46.88% - 03-Oct</font></td><td><font class="s2">-94.06% - 03-May</font></td>
</tr>
</table>
</td>
</tr>
</table>
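Answer 0 (score: 0)
The page is dynamic (the price table is written by JavaScript), so you first need something like Selenium to render it.
Once you have the rendered HTML, you can parse it with BeautifulSoup or even with Selenium itself. I noticed, though, that the value sits inside <table> tags. Whenever I see <table> tags, I usually reach for pandas' .read_html(), because it does the hard work for you.
.read_html() returns a list of dataframes; after that you just look for the data you need, or manipulate the tables as required. The value you want is in the dataframe at index position 4 (it is also in position 0, but I chose 4 because there it sits right in the second row, first column). Then just slice that dataframe to get that specific cell: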
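from io import StringIO

import pandas as pd
from selenium import webdriver

# Selenium 3 style call; recent Selenium 4 releases expect a Service object instead.
driver = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')
url = 'https://www.ivolatility.com/options/RVX/'
driver.get(url)
# read_html returns one dataframe per <table> found in the rendered page.
tables = pd.read_html(StringIO(driver.page_source))
# Table 4, column 0, row 1 (second row, first column) holds the price.
price = tables[4][0][1]
# quit() also shuts down the chromedriver process, not just the window.
driver.quit()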
Output:
print (price)
19.17
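As a possible alternative that skips the browser: in the requests output shown in the question, the value is already present as plain text inside the inline script (d[343]='19.17'), and the Price cell is produced by the first wr(d[...]) call. The sketch below is only an illustration under those assumptions. The function name extract_price and the shortened sample string are made up for this example, and it assumes the site keeps emitting the script in this shape. The index 343 may well change between page loads, which is why the code looks up whichever index the first wr(d[...]) call uses rather than hard-coding it.

```python
import re


def extract_price(html: str) -> str:
    """Return the value written by the first wr(d[N]) call in the page script."""
    # Collect assignments such as d[343]='19.17' into {"343": "19.17", ...}.
    values = dict(re.findall(r"d\[(\d+)\]='([^']*)'", html))
    # In the dump from the question, the first wr(d[N]) call fills the Price cell.
    first_ref = re.search(r"wr\(d\[(\d+)\]\)", html)
    if first_ref is None:
        raise ValueError("no wr(d[N]) call found in the HTML")
    return values[first_ref.group(1)]


# Shortened sample modeled on the script in the question.
sample = """
d[189]='0.27';d[343]='19.17';d[112]='1.43';
wr('<td align="center"><font class=s1>');
wr(d[343]);
wr('</font></td>');
wr(d[189]);
"""

print(extract_price(sample))
```

With the real page you would pass the response text instead of the sample, for example extract_price(requests.get(url).text). If you prefer to stay with BeautifulSoup, you can first narrow the input to the text of the relevant script tag and run the same two regular expressions on that.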