Check whether a tag exists: if it does, append its value to a list; if not, append "na"

Time: 2018-06-20 21:40:22

Tags: python-3.x list beautifulsoup

I'm stuck on this.

I'm trying to write code that checks each entry in game_containers: if it contains a game_rating, append that rating to the ratings list; otherwise append "na" to the list.

I'm trying to get each game's name and rating so that the two lists line up:

from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

Headers = ["Game Name:", "Metascore", "Userscore:", "Release Date:", "Publisher:", "Rating:", 'Genre:']
names = []
metascores = []
userscores = []
release_dates = []
release_datesNew = []
publishers = []
ratings = []
ratingsNew = []
genres = []
genresNew = []

# NOTE: `i` and `year_number` are defined elsewhere in the script (not shown here).
for element in i:

    url = "http://www.metacritic.com/browse/games/score/metascore/year/pc/filtered?view=detailed&sort=desc&year_selected=" + format(year_number)

    print(url)

    year_number -= 1

    # Send a browser-like User-Agent; without it the request was being blocked.
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})

    web_byte = urlopen(req).read()

    webpage = web_byte.decode('utf-8')

    # Parse the downloaded page into a BeautifulSoup tree.
    html_soup = BeautifulSoup(webpage, 'html5lib')

    # Select every game on the page (entries 1 to 100) and the page-wide lists of fields.
    game_containers = html_soup.find_all("div", class_="wrap product_wrap")
    game_names = html_soup.find_all("div", class_="main_stats")
    game_metas = html_soup.find_all("a", class_="basic_stat product_score")  
    game_users = html_soup.find_all("li", class_='stat product_avguserscore')
    game_releases = html_soup.find_all("ul", class_='more_stats')
    game_publishers = html_soup.find_all("li", class_='stat publisher')
    game_ratings = html_soup.find_all("li", class_='stat maturity_rating')  # find_all, like the other lookups
    game_genres = html_soup.find_all("li", class_='stat genre')

    for container in game_containers:
        # Search inside this container only. Indexing a page-wide list by
        # position misaligns the results as soon as one game lacks a field.
        name_tag = container.find("div", class_="main_stats")
        names.append(name_tag.text.strip() if name_tag is not None else "na")

        rating_tag = container.find("li", class_="stat maturity_rating")
        ratings.append(rating_tag.text.strip() if rating_tag is not None else "na")

for x in ratings:
    # Drop the "Rating:" label and collapse the surrounding whitespace,
    # without depending on the exact indentation in the page source.
    ratingsNew.append(" ".join(str(x).replace("Rating:", "").split()))

What I've done so far is find each game's "container" (i.e. ("div", class_="wrap product_wrap")).

But I can't figure out how to handle a container where the rating doesn't exist (so that the list gets an "na" for that game).

Can someone point me in the right direction?

Thanks.
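
For reference, here is a minimal sketch of the per-container check described in the question. It is not a tested scraper for the live site: the HTML is a made-up stand-in that only reuses the class names from the code above, and `text_or_na` is a hypothetical helper name. The idea is to call `find` on each container rather than on the whole page, so a game without a rating gets "na" in its own position and the lists stay aligned.

```python
from bs4 import BeautifulSoup

# Made-up sample markup (an assumption, not the real page):
# the second game has no maturity rating.
SAMPLE_HTML = """
<div class="wrap product_wrap">
  <div class="main_stats">Game A</div>
  <ul class="more_stats"><li class="stat maturity_rating">Rating: M</li></ul>
</div>
<div class="wrap product_wrap">
  <div class="main_stats">Game B</div>
  <ul class="more_stats"></ul>
</div>
"""


def text_or_na(container, name, css_class):
    """Return the stripped text of the first matching tag inside container, or "na"."""
    tag = container.find(name, class_=css_class)
    return tag.get_text(strip=True) if tag is not None else "na"


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

names, ratings = [], []
for container in soup.find_all("div", class_="wrap product_wrap"):
    # Both lookups are scoped to this one container.
    names.append(text_or_na(container, "div", "main_stats"))
    ratings.append(text_or_na(container, "li", "stat maturity_rating"))

print(list(zip(names, ratings)))
```

Because every container contributes exactly one entry to each list, `names[n]` and `ratings[n]` always refer to the same game, and no try/except is needed around the lookup.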
