Scraping data from a table in Python

Time: 2016-12-10 07:58:22

Tags: python-2.7 web-scraping beautifulsoup

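I want to get all the data from the table on the linked page (the URL used in the code below) using the BeautifulSoup library in Python. The problem is that the table's values are generated by JavaScript code.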

import requests
from bs4 import BeautifulSoup

url = 'http://www.nseindia.com/live_market/dynaContent/live_analysis/top_gainers_losers.htm?cat=G'
parser = 'html5lib'
s = requests.session()
r = s.get(url)
soup = BeautifulSoup(r.text, parser)
l = soup.table  # first <table> element found in the static HTML
print(l)

With the code above, I get:

<table cellspacing="0" id="topGainers">
                    <tbody><tr style="width:200px">

                      <th title="Symbol">Symbol</th>
                      <th title="Last Traded Price">LTP<br/></th>

                      <th title="% Change">%<br/>Change</th>

                      <th title="Traded Volume">Traded<br/>Qty</th>                  
                      <th title="Traded Value">Value<br/>(in Lakhs)</th>
                      <th title="Open">Open</th>
                      <th title="High">High</th>
                      <th title="Low">Low</th>
                      <th title="Previous Close">Prev.<br/>Close</th>
                      <th title="Latest Ex Date"><nobr>Latest Ex Date</nobr></th> 
                      <th class="last" style="width:18px;" title="Corporate Action">CA</th>

                    </tr>
                       <script>
                        document.write(tds);
                      </script>
                  </tbody></table>

But I want this:

<table id="topGainers" cellspacing="0">
                    <tbody><tr style="width:200px">

                      <th title="Symbol">Symbol</th>
                      <th title="Last Traded Price">LTP<br></th>

                      <th title="% Change">%<br>Change</th>

                      <th title="Traded Volume">Traded<br>Qty</th>                  
                      <th title="Traded Value">Value<br>(in Lakhs)</th>
                      <th title="Open">Open</th>
                      <th title="High">High</th>
                      <th title="Low">Low</th>
                      <th title="Previous Close">Prev.<br>Close</th>
                      <th title="Latest Ex Date"><nobr>Latest Ex Date</nobr></th> 
                      <th class="last" style="width:18px;" title="Corporate Action">CA</th>

                    </tr>
                       <script>
                        document.write(tds);
                      </script><tr class="alt"><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=SBIN" target="_blank">SBIN</a></td><td class="number">266.60</td><td class="number">2.58</td><td class="number">1,60,04,173</td><td class="number">42,283.03</td><td class="number">260.80</td><td class="number">267.20</td><td class="number">258.80</td><td class="number">259.90</td><td>03-Jun-2016</td><td><img style="float:right" title="Dividend - Rs 2.60/- Per Share" src="/live_market/resources/images/note_ico.gif"></td></tr><tr><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=BANKBARODA" target="_blank">BANKBARODA</a></td><td class="number">162.15</td><td class="number">2.30</td><td class="number">1,02,78,079</td><td class="number">16,528.18</td><td class="number">159.10</td><td class="number">162.75</td><td class="number">157.20</td><td class="number">158.50</td><td>16-Jun-2016</td><td><img style="float:right" title="Annual General Meeting" src="/live_market/resources/images/note_ico.gif"></td></tr><tr class="alt"><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=TATAPOWER" target="_blank">TATAPOWER</a></td><td class="number">77.10</td><td class="number">2.05</td><td class="number">95,03,932</td><td class="number">7,283.81</td><td class="number">75.70</td><td class="number">77.50</td><td class="number">75.15</td><td class="number">75.55</td><td>07-Sep-2016</td><td><img style="float:right" title="Dividend-Rs.1.30/- Per Share (Book Closure Dates Revised)" src="/live_market/resources/images/note_ico.gif"></td></tr><tr><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=ACC" target="_blank">ACC</a></td><td class="number">1,383.90</td><td class="number">1.86</td><td class="number">2,13,389</td><td class="number">2,927.93</td><td class="number">1,362.25</td><td class="number">1,385.00</td><td class="number">1,355.55</td><td 
class="number">1,358.65</td><td>02-Aug-2016</td><td><img style="float:right" title="Interim Dividend - Rs 11/- Per Share (Purpose Revised)" src="/live_market/resources/images/note_ico.gif"></td></tr><tr class="alt"><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=ICICIBANK" target="_blank">ICICIBANK</a></td><td class="number">267.60</td><td class="number">1.85</td><td class="number">1,59,39,339</td><td class="number">42,381.11</td><td class="number">263.95</td><td class="number">269.80</td><td class="number">263.00</td><td class="number">262.75</td><td>16-Jun-2016</td><td><img style="float:right" title="Annual General Meeting/ Dividend - Rs 5/- Per Share" src="/live_market/resources/images/note_ico.gif"></td></tr><tr><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=ZEEL" target="_blank">ZEEL</a></td><td class="number">460.00</td><td class="number">1.58</td><td class="number">24,44,150</td><td class="number">11,244.31</td><td class="number">458.00</td><td class="number">464.00</td><td class="number">455.25</td><td class="number">452.85</td><td>21-Jul-2016</td><td><img style="float:right" title="Annual General Meeting/ Dividend -Rs 2.25/- Per Share" src="/live_market/resources/images/note_ico.gif"></td></tr><tr class="alt"><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=HINDALCO" target="_blank">HINDALCO</a></td><td class="number">182.00</td><td class="number">1.56</td><td class="number">95,43,322</td><td class="number">17,349.76</td><td class="number">180.45</td><td class="number">183.45</td><td class="number">179.05</td><td class="number">179.20</td><td>06-Sep-2016</td><td><img style="float:right" title="Dividend - Re 1/- Per Share" src="/live_market/resources/images/note_ico.gif"></td></tr><tr><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=TECHM" target="_blank">TECHM</a></td><td class="number">470.75</td><td class="number">1.51</td><td 
class="number">20,17,058</td><td class="number">9,513.45</td><td class="number">467.00</td><td class="number">474.95</td><td class="number">465.00</td><td class="number">463.75</td><td>28-Jul-2016</td><td><img style="float:right" title="Annual General Meeting/Final Dividend Rs 6/- Per Share And Special Dividend Rs 6/- Per Share" src="/live_market/resources/images/note_ico.gif"></td></tr><tr class="alt"><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=AXISBANK" target="_blank">AXISBANK</a></td><td class="number">455.65</td><td class="number">1.40</td><td class="number">77,80,448</td><td class="number">35,172.29</td><td class="number">451.70</td><td class="number">458.00</td><td class="number">445.55</td><td class="number">449.35</td><td>07-Jul-2016</td><td><img style="float:right" title="Annual General Meeting/ Dividend-Rs 5/- Per Share" src="/live_market/resources/images/note_ico.gif"></td></tr><tr><td><a href="/live_market/dynaContent/live_watch/get_quote/GetQuote.jsp?symbol=INDUSINDBK" target="_blank">INDUSINDBK</a></td><td class="number">1,113.20</td><td class="number">1.36</td><td class="number">10,68,419</td><td class="number">11,857.31</td><td class="number">1,105.00</td><td class="number">1,116.95</td><td class="number">1,092.35</td><td class="number">1,098.30</td><td>23-Jun-2016</td><td><img style="float:right" title="Annual General Meeting/ Dividend -Rs 4.50/- Per Share" src="/live_market/resources/images/note_ico.gif"></td></tr>
                  </tbody></table>

Please help. Thanks.

2 answers:

Answer 0 (score: 0)

A quick look at the HTML source shows these lines, which reference the script that fetches the dynamic data:

<!-- Added To Get Top Gainers/Losers Data From JSON File SwapnilG -->
<script type="text/javascript" src="/live_market/resources/js/getGainersLosersData.js"></script>

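By reading this script, you can reconstruct the data source, which appears to be this JSON file.

With this information, you should be able to fetch the data (you will probably need a JSON parser to do so).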
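As a rough illustration of the JSON-parsing step, here is a minimal sketch using Python's standard `json` module. The payload shape is an assumption: the key names `data`, `symbol`, `ltp`, and `netPrice` are hypothetical and chosen only for illustration, so check the real file for its actual keys. The sample values are taken from the table in the question.

```python
import json

# Hypothetical payload: the key names ("data", "symbol", "ltp", "netPrice")
# are assumptions for illustration; inspect the real file for its actual keys.
SAMPLE = '''
{"data": [
  {"symbol": "SBIN", "ltp": "266.60", "netPrice": "2.58"},
  {"symbol": "BANKBARODA", "ltp": "162.15", "netPrice": "2.30"}
]}
'''


def parse_rows(text, keys=("symbol", "ltp", "netPrice")):
    """Decode a JSON document and return one tuple per entry in its "data" list."""
    payload = json.loads(text)
    # Missing keys become None instead of raising, since the layout is a guess.
    return [tuple(row.get(k) for k in keys) for row in payload.get("data", [])]


rows = parse_rows(SAMPLE)
for row in rows:
    print(row)
```

If the real file uses different key names, only the `keys` argument should need to change.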

Answer 1 (score: 0)

This page uses JavaScript to fetch its data, and the data lives at this JSON URL:

https://www.nseindia.com/live_market/dynaContent/live_analysis/gainers/niftyGainers1.json

You can find this URL in Chrome's developer tools. The URL is all you need to get all the information you want.
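Requesting the JSON URL directly could look something like the sketch below. The helper name `fetch_json` and its `session`, `headers`, and `timeout` parameters are hypothetical, and the structure of the returned payload is not known from this thread, so the function simply returns whatever the server sends, decoded.

```python
def fetch_json(url, session=None, headers=None, timeout=10):
    """Download `url` and return the decoded JSON payload."""
    if session is None:
        # Imported lazily so the helper can be exercised with a stand-in session.
        import requests
        session = requests.Session()
    response = session.get(url, headers=headers, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of trying to decode an error page.
    response.raise_for_status()
    return response.json()
```

A call such as `fetch_json('https://www.nseindia.com/live_market/dynaContent/live_analysis/gainers/niftyGainers1.json')` would then return the parsed data, assuming the URL is still served. Printing the result's top-level keys first is a reasonable way to discover its layout. If the server rejects the default client, passing a browser-like `User-Agent` through `headers` may help, although that is an assumption and not something confirmed for this site.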