Question

I have the following:

html = '''<div class=“file-one”>
    <a href=“/file-one/additional” class=“file-link">
        <h3 class=“file-name”>File One</h3>
    </a>
    <div class=“location”>
        Down
    </div>
</div>'''

and want to get the text of the href, which is /file-one/additional. So I did:

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
link_text = “”

for a in soup.find_all(‘a’, href=True, text=True):
    link_text = a[‘href’]

print “Link: “ + link_text

But it only prints a blank, nothing. Just Link: with no value after it. So I tested it on another site, but with different HTML, and it worked.

What might I be doing wrong? Or is it possible that the site is deliberately programmed not to return the href?

Thanks in advance, I will definitely upvote/accept the answer!

Answer 1

First, use a different text editor that doesn't use curly quotes.
Second, remove the text=True argument from the soup.find_all call.
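
As a rough illustration of the text=True point, the sketch below runs both variants against the question's HTML, rewritten with straight quotes. It assumes Python 3 and a reasonably recent bs4, where the text argument is spelled string. The variable names with_text and hrefs are made up for this example. The <a> here wraps an <h3> plus whitespace rather than a single string, so its .string is None and the string filter skips it.

```python
from bs4 import BeautifulSoup

# The question's HTML, with straight quotes instead of curly ones.
html = '''<div class="file-one">
    <a href="/file-one/additional" class="file-link">
        <h3 class="file-name">File One</h3>
    </a>
    <div class="location">
        Down
    </div>
</div>'''

soup = BeautifulSoup(html, 'html.parser')

# string=True (formerly text=True) only keeps tags whose .string is set.
# This <a> has several children, so .string is None and nothing matches.
with_text = soup.find_all('a', href=True, string=True)

# Without the string filter, every <a> that has an href attribute matches.
hrefs = [a['href'] for a in soup.find_all('a', href=True)]

for href in hrefs:
    print("Link: " + href)
```

If the surrounding text of the link is also needed, a.get_text(strip=True) on each matched tag is one way to read it after the fact instead of filtering on it.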

Answer 2

You can also use attrs to get the href by searching with a regular expression:

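
A minimal sketch of what a regex search through attrs could look like with BeautifulSoup. This is not necessarily the code this answer originally contained; the pattern ^/file-one/ is an assumption chosen to fit the question's HTML, and the names link and href are illustrative.

```python
import re

from bs4 import BeautifulSoup

# The question's HTML, with straight quotes.
html = '''<div class="file-one">
    <a href="/file-one/additional" class="file-link">
        <h3 class="file-name">File One</h3>
    </a>
    <div class="location">
        Down
    </div>
</div>'''

soup = BeautifulSoup(html, 'html.parser')

# A compiled pattern passed as an attribute value is matched with search(),
# so this finds the first <a> whose href starts with /file-one/
# (the pattern is an assumption for this example).
link = soup.find('a', attrs={'href': re.compile(r'^/file-one/')})

# find() returns None when nothing matches, so guard before indexing.
href = link['href'] if link is not None else None
print(href)
```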

Answer 3

You can solve this with a few lines of gazpacho:


from gazpacho import Soup

html = """\
<div class="file-one">
    <a href="/file-one/additional" class="file-link">
      <h3 class="file-name">File One</h3>
    </a>
    <div class="location">
      Down
    </div>
  </div>
"""

soup = Soup(html)
soup.find("a", {"class": "file-link"}).attrs['href']

Which would output:

'/file-one/additional'

Python + BeautifulSoup: How to get the 'href' attribute of an 'a' element?

3 answers: