Question

I want to get the data between specific tags in some HTML.

<ul>    
    <li>
        More consistent tension control and approximation with each pass than with traditional sutures.
        <ul>                    
            <li>Unique anchor designs provide multiple points of fixation along the device, allowing tension on the device to be maintained during closure.<sup><a class="reference_link" href="#22">[22]</a></sup></li>
            <li>Compared to traditional sutures, STRATAFIX™ Devices enable surgeons to easily manage tension and control approximation with each pass.<sup><a class="reference_link" href="#3">[3]</a></sup></li>
        </ul>
    </li>
</ul>

Here, I want to get the data from <a class="reference_link" href="#3">[3]</a> and store that value (e.g. 3).

Thanks in advance.

Answer 1

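There seem to be relevant resources online about how to parse HTML on iOS; for example, http://www.raywenderlich.com/14172/how-to-parse-html-on-ios:

[...] there is a handy little library, included in the iOS SDK, called libxml2.

As far as I can tell, the article includes code examples showing how to implement exactly what you need.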

Answer 2

You can use Python with the Beautiful Soup module to parse the HTML page.

Here is a link to it - http://www.crummy.com/software/BeautifulSoup/

Here is some sample code you can follow: http://www.pythonforbeginners.com/python-on-the-web/beautifulsoup-4-python/
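As a rough sketch of how that could look for the HTML in the question (assuming Python 3 with the bs4 package installed; the HTML constant is a shortened copy of the question's markup and reference_numbers is a made-up helper name, not something from the linked pages):

```python
from bs4 import BeautifulSoup

# Shortened copy of the HTML from the question (illustrative only).
HTML = """
<ul>
    <li>
        More consistent tension control.
        <ul>
            <li>Unique anchor designs.<sup><a class="reference_link" href="#22">[22]</a></sup></li>
            <li>Compared to traditional sutures.<sup><a class="reference_link" href="#3">[3]</a></sup></li>
        </ul>
    </li>
</ul>
"""


def reference_numbers(html):
    """Return the text of every <a class="reference_link">, without the brackets."""
    soup = BeautifulSoup(html, "html.parser")
    # CSS selector: <a> elements whose class list contains "reference_link".
    links = soup.select("a.reference_link")
    # "[3]" -> "3": strip the surrounding brackets and any stray spaces.
    return [link.get_text().strip("[] ") for link in links]


numbers = reference_numbers(HTML)
print(numbers)  # ['22', '3']
```

This reads the visible link text; if you would rather store the target of the link, link.get("href") gives "#3" instead.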

Answer 3

If you are using jQuery, this may be useful to you:

 var items = $('#listTable li sup');

Here listTable is the id of the list view.

Answer 4

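Try Beautiful Soup. Here is the code:

from urllib.request import urlopen
from bs4 import BeautifulSoup

# Download the page and read its raw HTML
response = urlopen('http://www.crummy.com/software/BeautifulSoup/bs4/doc/')
html = response.read()
# Parse the downloaded HTML with the built-in parser
soup = BeautifulSoup(html, 'html.parser')
# Print the href attribute of every <a> tag
for link in soup.find_all('a'):
    print(link.get('href'))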

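This applies if you are using Python as your programming language. You will get all the links in the document. Here is the link to the Beautiful Soup documentation: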

http://www.crummy.com/software/BeautifulSoup/bs4/doc/
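If installing Beautiful Soup is not an option, a similar link walk can be sketched with html.parser from the Python 3 standard library. This is only an illustration and swaps in a different parser from the one used above; ReferenceLinkParser and reference_ids are hypothetical names, and the snippet string is a trimmed piece of the question's HTML:

```python
from html.parser import HTMLParser


class ReferenceLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose class includes "reference_link"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # attrs arrives as a list of (name, value) pairs
        classes = (attributes.get("class") or "").split()
        if tag == "a" and "reference_link" in classes:
            self.hrefs.append(attributes.get("href"))


def reference_ids(html):
    """Return the href targets of all reference links, without the leading '#'."""
    parser = ReferenceLinkParser()
    parser.feed(html)
    parser.close()
    # "#3" -> "3"; links without an href are skipped.
    return [href.lstrip("#") for href in parser.hrefs if href]


snippet = '<li>Text<sup><a class="reference_link" href="#3">[3]</a></sup></li>'
print(reference_ids(snippet))  # ['3']
```

Filtering on the class attribute is what narrows "all links in the document" down to just the reference links the question asks about.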

Getting the data between specific tags in HTML

4 answers: