Scraping a website by font or color in Scrapy

Time: 2019-06-07 15:39:56

Tags: python scrapy splash-screen scrapy-splash

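I need to scrape prices from a website, and I ran into a problem: some prices are struck through and the new price is shown in red/bold. The HTML for those prices is different, so my code returns an empty price. I decided to add an if statement to get the right data, but the catch is that the struck-through price has the same identifier, so I get that price instead of the red one. Is there a way in Scrapy to scrape the price I need based on its color being red or its font being bold? If not, is there another way to get the right price?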

Partial HTML code for price, need the second price, 13.49

for game in response.css("tr[class^=deckdbbody]"):

    # Set saved_name to the extracted card name, or keep the previous name if this row has none
    saved_name = game.css("a.card_popup::text").extract_first() or saved_name
    # Store the name in the item and strip the leading '\n' from the output
    item["Card_Name"] = saved_name.strip()
    # Check whether the output is null, which happens when one card has two different conditions
    if item["Card_Name"] is not None:
        # If not null, then store the value in saved_name
        saved_name = item["Card_Name"].strip()
    # If null, reuse the previous card name, since a null value means the same card is listed twice
    else:
        item["Card_Name"] = saved_name
    # Extract the condition, stock, and price using the corresponding HTML from the website
    item["Condition"] = game.css("td[class^=deckdbbody].search_results_7 a::text").get()
    item["Stock"] = game.css("td[class^=deckdbbody].search_results_8::text").extract_first()
    item["Price"] = game.css("td[class^=deckdbbody].search_results_9::text").extract_first()
    if item["Price"] is None:
        item["Price"] = game.css("td[class^=deckdbbody].search_results_9 span::text").get()

    # Yield the item
    yield item

2 Answers:

Answer 0 (score: 1)

You can filter on the style attribute:

response.css('span[style^="color:red;"]::text').get()

Answer 1 (score: 1)

You need to adjust your expression:

if item["Price"] is None:
    item["Price"] = game.css("td[class^=deckdbbody].search_results_9 span[style*='color:red']::text").get()