Python - BeautifulSoup4 returns "None"?

Time: 2018-09-23 20:32:41

Tags: python beautifulsoup screen-scraping

I want to get the .text of an element, but it doesn't work at all.

Here is my code:

from bs4 import BeautifulSoup
import requests

generatedLink = "MyLink"
page = requests.get(generatedLink)
contents = page.text
soup = BeautifulSoup(contents, "html.parser")
name = soup.find('a',class_=["yt-simple-endpoint", "style-scope", "ytd-video-renderer"])

print(name)

It prints None, although the page contains this element:

<a id="video-title" class="yt-simple-endpoint style-scope ytd-video-renderer" aria-label="TURNIR 1 VS 1 U LOLU FINALEE!! od korisnika KaLuu Vrijeme streaminga: prije 3 dana 3 sata i 49 minuta 644 pregleda" href="/watch?v=5N4X4hjkzOw" title="TURNIR 1 VS 1 U LOLU FINALEE!!">

TURNIR 1 VS 1 U LOLU FINALEE!!

</a>

I need to extract the text from the title!

Something is wrong here, but I can't find the error in my code. Can anyone help me?

1 answer:

Answer 0: (score: 0)

Although the element looks like this when the page is displayed in the browser,

<a id="video-title" class="yt-simple-endpoint style-scope ytd-video-renderer" aria-label="VRACAMO SE MNOGO JACIII!!! by KaLuu Streamed 3 days ago 2 hours, 2 minutes 291 views" href="/watch?v=QPNienUChDg" title="VRACAMO SE MNOGO JACIII!!!">
                VRACAMO SE MNOGO JACIII!!!
              </a>

the HTML that requests receives contains this element instead:

    <a aria-describedby="description-id-904842" class="yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2" data-sessionlink="ei=6T6oW7LkMcKKowP9zqLoBQ&amp;feature=c4-overview&amp;ved=CDYQ-SUYACITCPL87M-_0t0CFULFaAodfacIXSibHA" dir="ltr" href="/watch?v=QPNienUChDg" rel="nofollow" title="VRACAMO SE MNOGO JACIII!!!">
VRACAMO SE MNOGO JACIII!!!
</a>

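Comparing the two, you can see that the class value changes from "yt-simple-endpoint style-scope ytd-video-renderer" to "yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2". The classes you searched for are not in the fetched HTML, so find() returns None. Some websites serve different markup depending on geolocation or on the client making the request.

Once I had identified this, I got the value with the following code.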
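To make the mismatch concrete, here is a minimal sketch that runs offline. SERVER_HTML is a hypothetical stand-in, trimmed from the element quoted above, for what requests receives; it is not a live response. The sketch searches for one of the browser-side classes and then lists the classes that are actually present on the links.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the fetched markup (trimmed from the element above).
SERVER_HTML = """
<a class="yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2"
   href="/watch?v=QPNienUChDg" title="VRACAMO SE MNOGO JACIII!!!">
VRACAMO SE MNOGO JACIII!!!
</a>
"""

soup = BeautifulSoup(SERVER_HTML, "html.parser")

# This class only exists in the browser-rendered page, so nothing matches.
missing = soup.find("a", class_="ytd-video-renderer")

# Collect the class lists that really occur on <a> tags carrying an href.
seen_classes = [a.get("class", []) for a in soup.find_all("a", href=True)]

print(missing)
print(seen_classes)
```

Running the same inspection on page.text from your own request should show which class names are safe to search for.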

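import requests
from bs4 import BeautifulSoup

url = 'https://www.youtube.com/channel/UCtBGKF3uQNybKeelFz4PolA'
# Parse the HTML as requests receives it, not as the browser renders it
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
print(soup)  # dump the fetched markup to check which classes it contains
# Search for the class string that appears in the fetched markup
name = soup.find('a', {'class': 'yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2'})

print(name.text)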
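A slightly more defensive variant, as a sketch rather than a definitive fix: matching the full class string breaks as soon as the site reorders or adds a class, and name.text raises AttributeError when nothing is found. The helper below is hypothetical (extract_title is my name, not part of any library) and assumes that the single class "yt-uix-tile-link" is enough to identify the title link. It is demonstrated on inline HTML so it runs without network access.

```python
from bs4 import BeautifulSoup


def extract_title(html):
    """Return the text of the first title link, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    # A single class name matches any tag that has it among its classes,
    # regardless of order or of which other classes are present.
    link = soup.find("a", class_="yt-uix-tile-link")
    if link is None:
        return None  # avoid AttributeError on a missing element
    return link.get_text(strip=True)


# Hypothetical sample markup, trimmed from the element shown earlier.
SAMPLE = (
    '<a class="yt-uix-sessionlink yt-uix-tile-link spf-link" '
    'href="/watch?v=QPNienUChDg" title="VRACAMO SE MNOGO JACIII!!!">\n'
    'VRACAMO SE MNOGO JACIII!!!\n</a>'
)

title = extract_title(SAMPLE)
no_match = extract_title("<p>no links here</p>")
print(title)
print(no_match)
```

With a live page you would pass requests.get(url).text to extract_title instead of SAMPLE.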

Hope this helps! Cheers!