I want to get the .text, but it doesn't work at all.
Here is my code:
from bs4 import BeautifulSoup
import requests
generatedLink = "MyLink"
page = requests.get(generatedLink)
contents = page.text
soup = BeautifulSoup(contents, "html.parser")
name = soup.find('a',class_=["yt-simple-endpoint", "style-scope", "ytd-video-renderer"])
print(name)
It returns None.
<a id="video-title" class="yt-simple-endpoint style-scope ytd-video-renderer" aria-label="TURNIR 1 VS 1 U LOLU FINALEE!! od korisnika KaLuu Vrijeme streaminga: prije 3 dana 3 sata i 49 minuta 644 pregleda" href="/watch?v=5N4X4hjkzOw" title="TURNIR 1 VS 1 U LOLU FINALEE!!">
TURNIR 1 VS 1 U LOLU FINALEE!!
</a>
I need to extract the text from the title!
Something is wrong here, but I can't find the mistake in my code. Can anyone help me?
Answer 0 (score: 0)
Although the element appears like this when the page is viewed in the browser,
<a id="video-title" class="yt-simple-endpoint style-scope ytd-video-renderer" aria-label="VRACAMO SE MNOGO JACIII!!! by KaLuu Streamed 3 days ago 2 hours, 2 minutes 291 views" href="/watch?v=QPNienUChDg" title="VRACAMO SE MNOGO JACIII!!!">
VRACAMO SE MNOGO JACIII!!!
</a>
fetching the page with requests returns the element in this form instead.
<a aria-describedby="description-id-904842" class="yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2" data-sessionlink="ei=6T6oW7LkMcKKowP9zqLoBQ&feature=c4-overview&ved=CDYQ-SUYACITCPL87M-_0t0CFULFaAodfacIXSibHA" dir="ltr" href="/watch?v=QPNienUChDg" rel="nofollow" title="VRACAMO SE MNOGO JACIII!!!">
VRACAMO SE MNOGO JACIII!!!
</a>
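Comparing the two, you can see that the class value changes from yt-simple-endpoint style-scope ytd-video-renderer
to yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2.
Some sites serve different markup depending on geolocation and on the client making the request, so the classes seen in the browser are not necessarily the ones your script receives.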
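Since the class names differ between the two responses, one option is to match on something both versions share. The sketch below is only an illustration: it parses trimmed copies of the two anchor snippets quoted above from inline strings (no network request), and the helper name `video_titles` and the `a[href^="/watch"]` selector are my own assumptions, not something the site guarantees to keep stable.

```python
from bs4 import BeautifulSoup

# The two variants of the same link quoted above, trimmed to the relevant attributes.
BROWSER_HTML = '''
<a id="video-title" class="yt-simple-endpoint style-scope ytd-video-renderer"
   href="/watch?v=QPNienUChDg" title="VRACAMO SE MNOGO JACIII!!!">
VRACAMO SE MNOGO JACIII!!!
</a>
'''
REQUESTS_HTML = '''
<a class="yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2"
   dir="ltr" href="/watch?v=QPNienUChDg" rel="nofollow" title="VRACAMO SE MNOGO JACIII!!!">
VRACAMO SE MNOGO JACIII!!!
</a>
'''


def video_titles(html):
    """Return the stripped text of every <a> whose href starts with /watch (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Select by href prefix, so the result does not depend on which class names were served.
    return [a.get_text(strip=True) for a in soup.select('a[href^="/watch"]')]


print(video_titles(BROWSER_HTML))
print(video_titles(REQUESTS_HTML))
```

Both calls find the same link even though the class attributes have nothing in common, which is why a selector based on the href may hold up better than one based on the class string.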
Once I had identified this, I got the value with the following code.
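import requests
from bs4 import BeautifulSoup
url = 'https://www.youtube.com/channel/UCtBGKF3uQNybKeelFz4PolA'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
# Print the parsed page to inspect the HTML that requests actually received.
print(soup)
# Match on the class string found in the fetched HTML, not the one shown in the browser.
name = soup.find('a', {'class': 'yt-uix-sessionlink yt-uix-tile-link spf-link yt-ui-ellipsis yt-ui-ellipsis-2'})
print(name.text)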
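Note that `find` returns `None` when nothing matches, so `name.text` raises an `AttributeError` if the class is not present in the fetched HTML. Here is a minimal sketch of a guard, run against an inline HTML string rather than a live request; `first_link_text` is a hypothetical helper name and the sample markup is shortened for illustration.

```python
from bs4 import BeautifulSoup


def first_link_text(html, css_class):
    """Return the text of the first <a> carrying css_class, or None if nothing matches."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", class_=css_class)
    # find() returns None on no match, so check before reading the text.
    return link.get_text(strip=True) if link is not None else None


html = '<a class="yt-uix-tile-link spf-link" href="/watch?v=QPNienUChDg">VRACAMO SE MNOGO JACIII!!!</a>'
print(first_link_text(html, "yt-uix-tile-link"))    # class is present in the markup
print(first_link_text(html, "ytd-video-renderer"))  # class is absent, so this prints None
```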
Hope this helps! Cheers!