Question

I have parsed an HTML page with BeautifulSoup:

authors = soup.find_all("span", itemprop = 'author')
for author in authors:
    print(author)

This gives me the authors:

<span content="Oliver" itemprop="author"></span>
<span content="Jack" itemprop="author"></span>

How do I get the value of the content attribute?

I tried:

for author in authors:
    print(author.content)

But I got nothing.

Answer 1

To get the content, you should do the following:

    for author in authors:
        print(author["content"])

Alternatively, you can store all the authors in the all_authors variable (as a list) with the following code:

    all_authors = [author["content"] for author in authors]

Hope this helps!
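For a self-contained check, here is a minimal sketch that rebuilds the two spans from the question as an inline string and collects their content attributes into a list. The `html` variable and the choice of the built-in `html.parser` are assumptions made for illustration; the real page and parser may differ.

```python
from bs4 import BeautifulSoup

# Assumption: the two spans from the question, inlined as a string.
html = """
<span content="Oliver" itemprop="author"></span>
<span content="Jack" itemprop="author"></span>
"""

soup = BeautifulSoup(html, "html.parser")
authors = soup.find_all("span", itemprop="author")

# Subscript access reads an HTML attribute of the tag.
all_authors = [author["content"] for author in authors]
print(all_authors)
```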

Answer 2

You're close:

for author in authors:
    print(author['content'])
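As background on why the attempt in the question comes back empty: in BeautifulSoup, dot access on a tag (other than built-in names such as `contents`) is treated as a search for a child tag with that name, so `author.content` looks for a nested `<content>` element and yields `None`. HTML attributes are read with subscript syntax or `.get()`. The sketch below illustrates the difference on a single span; the variable names and the `data-missing` attribute are hypothetical.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<span content="Oliver" itemprop="author"></span>', "html.parser")
author = soup.find("span", itemprop="author")

# Dot access searches for a child <content> tag; there is none, so this is None.
dot_access = author.content

# Subscript access reads the HTML attribute and raises KeyError if it is absent.
subscript = author["content"]

# .get() also reads the attribute, but returns None when it is absent.
with_get = author.get("content")
missing = author.get("data-missing")  # hypothetical attribute that is not on the tag

print(dot_access, subscript, with_get, missing)
```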

Answer 3

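If you are not sure that every element with itemprop="author" also has a content attribute, you can chain attribute selectors (AND semantics) so that an element must have both attributes before you try to access content: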

authors = [i['content'] for i in soup.select('[itemprop=author][content]')]
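To see what the combined selector guards against, the sketch below uses a hypothetical page in which one author span lacks a content attribute and an unrelated meta tag has one. The markup is invented for illustration. Subscript access on the span without the attribute raises `KeyError`, while the selector version skips it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second span has no content attribute,
# and the meta tag has content but a different itemprop.
html = """
<span content="Oliver" itemprop="author"></span>
<span itemprop="author">no content attribute</span>
<meta content="2020-01-01" itemprop="datePublished">
<span content="Jack" itemprop="author"></span>
"""
soup = BeautifulSoup(html, "html.parser")

# Only elements that carry BOTH attributes match the chained selector.
authors = [tag["content"] for tag in soup.select("[itemprop=author][content]")]
print(authors)

# Without the [content] guard, the span lacking the attribute raises KeyError.
raised = False
try:
    [tag["content"] for tag in soup.find_all("span", itemprop="author")]
except KeyError:
    raised = True
print(raised)
```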

Getting the content attribute of a span with BeautifulSoup
