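I have been trying to scrape this web page with BeautifulSoup, but without success. Can anyone help? I am not sure whether the problem is with the page or with my code.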
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup
my_url = "https://www.cea.gov.sg/Custom/CEA/PublicRegister/Page/PublicRegisterDetail.aspx?UserId=ae0cdf1d-a30c-4c8c-9f80-b2cec17b4bd9"
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = Soup(page_html, "html.parser")
nameList2 = page_soup.findAll("span")
print (nameList2.string[1])
Answer 0 (score: 0)
You can try it like this. I did not run into any problems.
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.cea.gov.sg/Custom/CEA/PublicRegister/Page/PublicRegisterDetail.aspx?UserId=ae0cdf1d-a30c-4c8c-9f80-b2cec17b4bd9")
soup = BeautifulSoup(response.text,"html.parser")
# Each ".form-wrap" block holds the name and the estate agent name
for item in soup.select(".form-wrap"):
    Name = item.select_one("#FtPublicRegisterDetail_LblName").get_text()
    Agent_Name = item.select_one("#FtPublicRegisterDetail_LblEstAgentName").get_text()
    print(Name, Agent_Name)
Result:
A R N MADANAGOPALAN (MADAN) PROPNEX REALTY PTE LTD
If you prefer, using only "span":
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.cea.gov.sg/Custom/CEA/PublicRegister/Page/PublicRegisterDetail.aspx?UserId=ae0cdf1d-a30c-4c8c-9f80-b2cec17b4bd9")
soup = BeautifulSoup(response.text,"html.parser")
doc_list = soup.select("span")
# Iterate over the tags directly instead of re-running the selector for each index
for span in doc_list:
    print(span.text)
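For reference, the code in the question appears to fail for two reasons that have nothing to do with the page itself: `Soup` is never defined (the import only brings in `BeautifulSoup`), and `findAll` returns a list-like `ResultSet`, which has no `.string` attribute, so you have to index or iterate first and only then read the text. Below is a minimal sketch of the corrected logic. The helper name `span_texts` and the sample HTML are made up for illustration so the snippet runs without network access; only the two element IDs are taken from the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the example runs offline.
SAMPLE_HTML = """
<div class="form-wrap">
  <span id="FtPublicRegisterDetail_LblName">Example Name</span>
  <span id="FtPublicRegisterDetail_LblEstAgentName">Example Agency Pte Ltd</span>
</div>
"""


def span_texts(html):
    """Return the stripped text of every <span> in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")  # BeautifulSoup, not Soup
    spans = soup.find_all("span")              # list-like ResultSet of tags
    # Read the text from each individual tag, not from the ResultSet itself.
    return [span.get_text(strip=True) for span in spans]


texts = span_texts(SAMPLE_HTML)
print(texts[1])  # index the list first, then use the text
```

With the real page you would pass `page_html` from the question (or `response.text` from the `requests` call) as `html`.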