BS4 cannot find the text

Time: 2019-06-16 21:48:35

Tags: python python-2.7 beautifulsoup

I am trying to print this text: https://i.imgur.com/SLl1URt.png. I used soup.find_all("p", class_="review") and tried calling .getText on the result and inspecting .contents, but neither worked.

Page link: https://m.wuxiaworld.co/Castle-of-Black-Iron/

Here is some debugging info: https://i.imgur.com/0k6NHeD.png


2 answers:

Answer 0 (score: 0)

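When you print the soup, the terminal shows only some of the HTML tags, not the full page source. I think the site hides part of its data, so I suggest using Selenium. If you have not downloaded ChromeDriver yet, you can get it here: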

https://chromedriver.storage.googleapis.com/index.html?path=2.35/

Full code:

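from selenium import webdriver
from selenium.webdriver.common.by import By

driver_path = r'your driver path'  # path to the downloaded chromedriver binary
# executable_path matches the Selenium 3 API of this answer's era;
# recent Selenium 4 releases expect a Service object instead.
browser = webdriver.Chrome(executable_path=driver_path)

browser.get("https://m.wuxiaworld.co/Castle-of-Black-Iron/")

# Select every <p class="review"> element on the rendered page
reviews = browser.find_elements(By.CSS_SELECTOR, "p[class='review']")
for review in reviews:
    print(review.text)
browser.quit()  # close the window and stop the driver process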

Output:

  

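Description After the Catastrophe, every rule in the world was rewritten. In the Age of Black Iron, steel, iron, steam engines and fighting force became the crux in which human beings depended on to survive. A commoner boy by the name Zhang Tie was selected by the gods of fortune and was gifted a small tree which could constantly produce various marvelous fruits. At the same time, Zhang Tie was thrown into the flames of war, a three-hundred-year war between the humans and monsters on the vacant continent. Using crystals to tap into the potentials of the human body, one must cultivate to become stronger. The thrilling legends of mysterious clans, secrets of Oriental fantasies, numerous treasures and legacies in the underground world - All in the Castle of Black Iron! Citadel of Black Iron 黑铁之堡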

Answer 1 (score: 0)

import requests
from bs4 import BeautifulSoup
from collections import OrderedDict

def info(novelname):        
    response = requests.get(
        'https://m.wuxiaworld.co/{}/'.format(novelname.replace(' ', '-')),
        headers=OrderedDict(
            (
                ("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7"),
                ("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
                ("Accept-Language", "en-US,en;q=0.5"),
                ("Accept-Encoding", "gzip, deflate"),
                ("Connection", "keep-alive"), 
                ("Upgrade-Insecure-Requests", "1")
            )
        )
    )

    if response.status_code == 200:
        # html5lib parses the page the way a browser would, repairing broken markup
        soup = BeautifulSoup(response.content, 'html5lib')

        for textp in soup.find_all('p', attrs={'class': 'review'}):
            print(textp.text.strip())

info('Castle of Black Iron')

The problem is your HTML parser. Using html5lib gives us:

Description

After the Catastrophe, every rule in the world was rewritten.

In the Age of Black Iron, steel, iron, steam engines and fighting force became the crux in which human beings depended on to survive.

A commoner boy by the name Zhang Tie was selected by the gods of fortune and was gifted a small tree which could constantly produce various marvelous fruits. At the same time, Zhang Tie was thrown into the flames of war, a three-hundred-year war between the humans and monsters on the vacant continent. Using crystals to tap into the potentials of the human body, one must cultivate to become stronger.

The thrilling legends of mysterious clans, secrets of Oriental fantasies, numerous treasures and legacies in the underground world — All in the Castle of Black Iron!

Citadel of Black Iron
黑铁之堡
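
The two answers blame different causes: data hidden from the raw HTML versus the choice of parser. One rough way to tell them apart, sketched here under assumptions, is to scan the raw response text with the standard library alone. If the paragraph turns up, the text is present in the HTML and the parser is the likelier culprit; if it does not, the text may be injected by JavaScript, which is the case where a browser driver such as Selenium helps. The names `ClassTextFinder`, `find_class_text` and `sample_html` are hypothetical, the sample markup is made up rather than taken from the real page, and the code targets Python 3 (under the python-2.7 tag the module is named `HTMLParser`). A negative result is not conclusive on badly malformed markup.

```python
from html.parser import HTMLParser


class ClassTextFinder(HTMLParser):
    """Collect the text of every <tag> element carrying a given class (hypothetical helper)."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0    # > 0 while inside a matching element
        self.texts = []   # one entry per matching element

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Count nested tags of the same name so the right end tag closes the match.
            if tag == self.tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.depth = 1
            self.texts.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.texts[-1] += data


def find_class_text(html, tag="p", css_class="review"):
    """Return the stripped text of each matching element found in the raw HTML."""
    finder = ClassTextFinder(tag, css_class)
    finder.feed(html)
    finder.close()
    return [text.strip() for text in finder.texts]


# Made-up sample standing in for response.text; not the real page.
sample_html = '<div><p class="review">Description <b>After</b> the Catastrophe</p></div>'
print(find_class_text(sample_html))
```

In practice the string passed in would be the body of the HTTP response (for example `response.text` from the second answer), checked before any BeautifulSoup parsing takes place.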