Question

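I'm working on a relatively simple project so I can teach myself Python, and I've gotten stuck. In the PyCharm debugger I have a variable named element, which shows up as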

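This variable is of type Tag, which looks right to me. On element, I want to check whether the class is "schedule_dgrd_time/result", as opposed to the class shown in the screenshot above.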

I can see that element has an attrs member.

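How do I access that value? If I use element.string I get the text value, which in this case would be Saturday (that much I can do), but I'd like to know whether I can check the value of the class attribute first.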

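I've been searching for a few days without getting anywhere, and I've run out of things to Google. Any help or pointers would be greatly appreciated. Thanks for reading.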

Update: here is my code

import urllib2
import datetime
import re
from bs4 import BeautifulSoup

# today's date
date = datetime.datetime.today().strftime('%-m/%d/%Y')
validDay = "Mon\.|Tue\.|Wed\.|Thu(r)?(s)?\.|Fri\."
website = "http://www.texassports.com/schedule.aspx?path=baseball"

opener = urllib2.build_opener()
##add headers that make it look like I'm a browser
opener.addheaders = [('User-Agent', 'Mozilla/5.0')]
page = opener.open(website)
# turn page into html object
soup = BeautifulSoup(page, 'html.parser')
#print soup.prettify()

#get all home games
all_rows = soup.find_all('tr', class_='schedule_home_tr')

# see if any game is today
# entryForToday = [t for t in all_rows if t.findAll('nobr',text=re.compile('.*({}).*'.format(date)))]

# hard coding for testing weekend
entryForToday = [t for t in all_rows if t.findAll('nobr',text=re.compile('3/11/2017'))]

time = "schedule_dgrd_time/result"

for elements in entryForToday:
    for element in elements:
        # this is where I'm stuck.
        # if element.attrs:
        #     print element.attrs['class'][0]

I know the doubly nested for loop isn't ideal, so if you have a better approach I'd be glad to hear it. Thanks.

Answer 1

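So I was able to figure it out. Some of the children were NavigableString objects, which have no attrs, and that is what raised the error. element.attrs['class'][0] works now. I had to check with isinstance whether each child is a Tag and skip it if it isn't. Anyway, for anyone interested, my code is below.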
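The failure described above can be reproduced on a small fragment. The sketch below is not the code from this answer; the HTML is a made-up row that only borrows the class names mentioned in the thread, and the variable names kinds and first_classes are introduced here for illustration. It shows that iterating over a Tag yields the whitespace between cells as NavigableString children, and that filtering with isinstance leaves only the children that carry a class attribute.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical row, loosely modelled on the class names used in this thread.
html = """<table><tr class="schedule_home_tr">
<td class="schedule_dgrd_game_day_of_week">Tue.</td>
<td class="schedule_dgrd_time/result">6:30 PM</td>
</tr></table>"""

row = BeautifulSoup(html, "html.parser").find("tr", class_="schedule_home_tr")

# Iterating over a Tag yields every direct child, including the newlines
# between the <td> elements, which are NavigableString objects.
kinds = [type(child).__name__ for child in row]

# Inspect only Tag children; .get("class") returns None when a tag has
# no class attribute, so such tags are skipped instead of raising KeyError.
first_classes = [
    child.get("class")[0]
    for child in row
    if isinstance(child, Tag) and child.get("class")
]
```

With html.parser, class is treated as a multi-valued attribute, so child.get("class") is a list, which is why the comparison in the thread indexes it with [0].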

import urllib2
import datetime
import re
from bs4 import BeautifulSoup
from bs4 import Tag

# today's date
date = datetime.datetime.today().strftime('%-m/%d/%Y')
validDay = "Mon\.|Tue\.|Wed\.|Thu(r)?(s)?\.|Fri\."
website = "http://www.texassports.com/schedule.aspx?path=baseball"

opener = urllib2.build_opener()
##add headers that make it look like I'm a browser
opener.addheaders = [('User-Agent', 'Mozilla/5.0')]
page = opener.open(website)
# turn page into html object
soup = BeautifulSoup(page, 'html.parser')
#print soup.prettify()

#get all home games
all_rows = soup.find_all('tr', class_='schedule_home_tr')

# see if any game is today
# entryForToday = [t for t in all_rows if t.findAll('nobr',text=re.compile('.*({}).*'.format(date)))]

# hard coding for testing weekend
entryForToday = [t for t in all_rows if t.findAll('nobr',text=re.compile('3/14/2017'))]

classForTime = "schedule_dgrd_time/result"
timeOfGame = "none"

# keep only the rows whose day-of-week cell is a weekday
if entryForToday:
    entryForToday = [t for t in entryForToday if t.findAll('td',
                                                            class_='schedule_dgrd_game_day_of_week',
                                                            text=re.compile('.*({}).*'.format(validDay)))]
if entryForToday:
    for elements in entryForToday:
        for element in elements:

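            # skip NavigableString children; only Tag objects carry attrs
            if isinstance(element, Tag):
                # .get avoids a KeyError when a tag has no class attribute
                classes = element.get('class') or []
                if classes and classes[0] == classForTime:
                    timeOfGame = element.text
                    break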

print timeOfGame
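On the question's point about the doubly nested loop: one possible alternative, offered as a sketch rather than as the approach this answer settled on, is to ask each row for the cell directly with find and a class_ filter, which removes the inner loop and the isinstance check. The helper name time_of_game, the sample HTML, and the schedule_dgrd_date class are assumptions made up for this example; only schedule_home_tr, nobr, and schedule_dgrd_time/result come from the thread.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup; the real page's structure may differ.
html = """<table>
<tr class="schedule_home_tr">
  <td class="schedule_dgrd_date"><nobr>3/11/2017</nobr></td>
  <td class="schedule_dgrd_time/result">2:00 PM</td>
</tr>
<tr class="schedule_home_tr">
  <td class="schedule_dgrd_date"><nobr>3/14/2017</nobr></td>
  <td class="schedule_dgrd_time/result">6:30 PM</td>
</tr>
</table>"""


def time_of_game(soup, date_text, time_class="schedule_dgrd_time/result"):
    """Return the time cell's text for the home game on date_text, or None."""
    date_pattern = re.compile(re.escape(date_text))
    for row in soup.find_all("tr", class_="schedule_home_tr"):
        # find only ever returns a Tag or None, so no NavigableString
        # can reach the attribute lookup.
        if row.find("nobr", string=date_pattern):
            cell = row.find("td", class_=time_class)
            if cell is not None:
                return cell.get_text(strip=True)
    return None


soup = BeautifulSoup(html, "html.parser")
found = time_of_game(soup, "3/14/2017")
missing = time_of_game(soup, "1/1/2017")
```

Here found holds the text of the matching row's time cell and missing is None because no row carries that date. The string argument needs a reasonably recent Beautiful Soup 4 release; older releases spell the same filter text, as in the code above.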

How to get an attribute value from a variable in Python
