Question

I can't find and return the value inside a <b> tag; I haven't had any luck reading any of the tags.

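I don't want to post a hundred lines of view-source output, and I don't know how to post a link properly, but if you are able to view the page source yourself, the page is given below.

The information I want to retrieve is on this page: http://yugiohprices.com/card_price?name=Dark+Magician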

Here is the code I am using:

import requests
from bs4 import BeautifulSoup
r = requests.get('http://yugiohprices.com/card_price?name=Dark+Magician')
soup = BeautifulSoup(r.content, "lxml")
print(soup.find('b').text)

This is the output:

Home | Top 100 | Browse Cards | Browse Sets

Purchase Stats | Watch List | Card Prices

Sell My Cards | Price Alerts | Blog | FAQ | Settings

No matter what I change or try, I can't get at the "LDK2-ENY10" text.

Answer 1

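You can see that the page takes a while to load its data. The data is fetched through an Ajax request, so the content that requests returns is not what you see in your browser. You can mimic the Ajax request with a simple GET to http://yugiohprices.com/get_card_prices/Dark+Magician, passing a timestamp: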

import requests
from time import time

r = requests.get("http://yugiohprices.com/get_card_prices/Dark+Magician?_={}".format(int(time())))

print(r.content)
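If you need the same request for other cards, the URL construction can be factored out. The following is a minimal sketch, not part of the original answer: the helper name build_prices_url and its now parameter are hypothetical, and it assumes the endpoint accepts any card name encoded the same way as Dark+Magician. The _ parameter is presumably only a cache-busting timestamp.

```python
from time import time
from urllib.parse import quote_plus

# Endpoint taken from the answer above.
BASE_URL = "http://yugiohprices.com/get_card_prices/"


def build_prices_url(card_name, now=None):
    """Build the Ajax URL for card_name (hypothetical helper).

    Spaces become '+', matching the Dark+Magician form used above.
    `now` lets a caller fix the timestamp; it defaults to the current time.
    """
    stamp = int(time() if now is None else now)
    return "{}{}?_={}".format(BASE_URL, quote_plus(card_name), stamp)


print(build_prices_url("Dark Magician", now=0))
# http://yugiohprices.com/get_card_prices/Dark+Magician?_=0
```

The result can be passed straight to requests.get().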

You will see all the details for the card, so to get what you want, just find the anchor whose href starts with /browse_sets?set:

In [1]: import requests
   ...: from time import time
   ...: from bs4 import BeautifulSoup
   ...: 
   ...: r = requests.get("http://yugiohprices.com/get_card_prices/Dark+Magician?_={}".format(int(time())))
   ...: soup = BeautifulSoup(r.content, "lxml")
   ...: print(soup.select_one('a[href^="/browse_sets?set"]').text)
   ...: 
Legendary Decks II

In [2]:
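The selector can be tried offline before hitting the site. This is only a sketch: SAMPLE is a made-up fragment (the real response markup may differ), and first_set_name is a hypothetical helper. It quotes the attribute value, since newer BeautifulSoup versions can reject unquoted values containing characters such as / and ?, and it returns None instead of raising when no matching anchor exists.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment for illustration only; not the site's actual HTML.
SAMPLE = """
<div>
  <b>Home</b>
  <a href="/browse_sets?set=Legendary+Decks+II">Legendary Decks II</a>
</div>
"""


def first_set_name(html):
    """Return the text of the first set link, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    # ^= matches anchors whose href starts with the given prefix.
    link = soup.select_one('a[href^="/browse_sets?set"]')
    return link.get_text(strip=True) if link is not None else None


print(first_set_name(SAMPLE))             # Legendary Decks II
print(first_set_name("<p>no links</p>"))  # None
```

Checking for None matters here because select_one returns None when nothing matches, and calling .text on it would raise an AttributeError.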

Python BeautifulSoup: accessing the text in a tag?
