Question

from bs4 import BeautifulSoup #imports beautifulSoup package
import urllib2

url2 = 'http://www.waldenu.edu/doctoral/phd-in-management/faculty'
page2 = urllib2.urlopen(url2)
soup2 = BeautifulSoup(page2.read(), "lxml")

row2 = soup2.findAll('p')
row2 = row2[18:-4] 

names2 = []
arrayNameLength = len(row2)
for x in names2:
    current2 = row2[x]
    currentString2 = current2.findAll('strong')
    if len(currentString2) > 0:
        currentString2 = currentString2[0]
        names2.append(currentString2.text)

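This is my code. Basically, I am trying to scrape the faculty names from the website above.

I think the problem is that I am not able to grab the names from the strong tags for every entry in the list.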

Answer 1

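You are looping with for x in names2:, but names2 is empty at that point, so the loop body never runs. You probably want for x in row2: instead.

Then, in the loop body, you can use x in place of current2, because x is not an index; it is the element itself. That means row2[x] is no longer needed.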

currentString2 = x.findAll('strong')
if len(currentString2) > 0:
    currentString2 = currentString2[0]
    names2.append(currentString2.text)
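
For reference, here is a minimal, self-contained sketch of the corrected loop. It is not the original page's markup: SAMPLE_HTML, the names in it, and the helper extract_names are made up for illustration, and it uses Python 3 with the built-in "html.parser" backend rather than urllib2 and "lxml", so it can run without fetching the live site.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the faculty page; the real markup may differ.
SAMPLE_HTML = """
<div>
  <p><strong>Jane Doe</strong>, PhD</p>
  <p>No name in this paragraph.</p>
  <p><strong>John Smith</strong>, DBA</p>
</div>
"""


def extract_names(html):
    """Return the text of the first <strong> tag in each <p>, if there is one."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    # Iterate over the <p> elements themselves, not over the empty result list.
    for paragraph in soup.find_all("p"):
        strong = paragraph.find("strong")  # first <strong> in this <p>, or None
        if strong is not None:
            names.append(strong.get_text(strip=True))
    return names


names = extract_names(SAMPLE_HTML)
print(names)
```

With the real page you would probably still need a slice such as row2[18:-4] to skip the paragraphs that are not faculty entries; that offset comes from the question and is not verified here.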

Web parsing in Python: trying to get faculty names between "strong" tags
