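I am trying to get the name and contact phone number from a div. The div sometimes contains only one span, sometimes two, and sometimes three. What I expect is:
This is what I have so far: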
# Swap url for url_1 or url_2 to see how the script behaves on other listings.
import re
import time

import bs4
from selenium import webdriver

url = "https://www.zillow.com/homedetails/19442-185th-Ave-SE-Renton-WA-98058/54831221_zpid/"
# url_1 = "https://www.zillow.com/homedetails/20713-61st-St-E-Bonney-Lake-WA-98391/99371104_zpid/"
# url_2 = "https://www.zillow.com/homes/fsbo/house_type/121319389_zpid/globalrelevanceex_sort/47.465758,-122.259207,47.404798,-122.398424_rect/12_zm/5f9305c92cX1-CRbri51bo8epha_yly1g_crid/0_mmm/"
browser = webdriver.Firefox()
browser.get(url)
time.sleep(5)  # give the page time to render
soup = bs4.BeautifulSoup(browser.page_source, 'html.parser')
contacts = browser.find_elements_by_css_selector("span.listing-field")
contact_name = []
contact_phone = "N/A"
contact_web = "N/A"
for contact in contacts:
    if len(contact.find_elements_by_tag_name("a")) > 0:
        # A span that contains a link holds the contact's website.
        contact_web = contact.find_element_by_tag_name("a").get_attribute("href")
    elif re.search(r"\(\d+\)\s+\d+-\d+", contact.text):
        # A span whose text looks like "(253) 335-8690" holds the phone number.
        contact_phone = contact.text
    else:
        contact_name.append(contact.text)
print(contact_phone)  # Output: (253) 335-8690
print(contact_name)  # Output: ['Sheetal Datta']
Answer 0 (score: 1)
Welcome to StackOverflow! You should solve this programmatically, that is, based on conditions. As you already mentioned,
if the name exists and the contact number exists,
    use them
else if the name exists only,
    use the name and assign the contact number as 'N/A'
else if the contact number exists only,
    use the contact number and assign the name as 'N/A'
As you can see, the pseudocode above maps directly onto an if-elif-else statement in Python. Depending on the structure of the web page, you need to check that a span exists before trying to read a value from it; you can follow this SO post to do so.
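The pseudocode in this answer can be sketched as follows. This is only a minimal illustration, assuming the span texts have already been collected into a list; the function name `pick_contact` and the pattern `PHONE_RE` are hypothetical and not part of the original code.

```python
import re

# Hypothetical pattern for US-style numbers such as "(253) 335-8690".
PHONE_RE = re.compile(r"\(\d{3}\)\s*\d{3}-\d{4}")


def pick_contact(span_texts):
    """Return (name, phone) from a list of span texts, with 'N/A' for missing values."""
    # Split the non-empty texts into phone-like strings and everything else.
    phones = [t for t in span_texts if t and PHONE_RE.search(t)]
    names = [t for t in span_texts if t and not PHONE_RE.search(t)]

    if names and phones:
        return names[0], phones[0]
    elif names:
        return names[0], "N/A"
    elif phones:
        return "N/A", phones[0]
    else:
        return "N/A", "N/A"


print(pick_contact(["Sheetal Datta", "(253) 335-8690"]))
print(pick_contact(["Sheetal Datta"]))
print(pick_contact(["(253) 335-8690"]))
```

Classifying each span by its content rather than by its position means the result does not depend on how many spans the div happens to contain.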
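Answer 1 (score: 0)
You can use try: except: to check whether the contact name and phone number exist, and then assign the values accordingly. Take a look at the code...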
from bs4 import BeautifulSoup
from selenium import webdriver
import time
url = ('https://www.zillow.com/homedetails/19442-185th-Ave-SE-Renton-WA-'
'98058/54831221_zpid/')
browser = webdriver.Firefox()
browser.get(url)
time.sleep(5)
soup = BeautifulSoup(browser.page_source,'html.parser')
browser.quit()
tag = soup.find('div', attrs={
    'class': 'home-details-listing-provided-by zsg-content-section'})
# find() returns None when nothing matches, so reading .text raises AttributeError.
try:
    contact_name = tag.find('span', attrs={
        'class': 'listing-field'}).text
except AttributeError:
    contact_name = 'N/A'
try:
    # The phone number is assumed to sit in the span right after the name.
    contact_phone = tag.find('span', attrs={
        'class': 'listing-field'}).findNext('span').text
except AttributeError:
    contact_phone = 'N/A'
print('Contact Name: {}\nContact Phone: {}'.format(
    contact_name, contact_phone))
Output:
Contact Name: Sheetal Datta
Contact Phone: (253) 335-8690
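The try/except pattern in this answer works because BeautifulSoup's find() returns None when no element matches, and reading .text on None raises AttributeError. A small helper can capture that idea in one place. The sketch below is an assumption-laden illustration: `text_or_default` is a hypothetical name, and SimpleNamespace objects stand in for parsed tags so the example runs without a browser.

```python
from types import SimpleNamespace


def text_or_default(lookup, default="N/A"):
    """Call lookup() and return its .text, or default if the element is missing."""
    try:
        return lookup().text
    except AttributeError:
        # lookup() returned None (no match), so there is no .text to read.
        return default


# Hypothetical stand-in for a parsed tag; real code would pass something like
# lambda: tag.find('span', attrs={'class': 'listing-field'})
found = SimpleNamespace(text="Sheetal Datta")

name = text_or_default(lambda: found)
phone = text_or_default(lambda: None)
print('Contact Name: {}\nContact Phone: {}'.format(name, phone))
```

Note that a purely positional lookup treats the first span as the name, so a listing with only a phone number would still need a content check such as a regular expression.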