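I want to get the URL that a search box leads to. For example, if I type "4/271 Balmoral Road" into the search box at https://www.realestate.co.nz/profile/, the matching result is shown and takes me to https://www.realestate.co.nz/profile/0b27093b9ce641108f7a6033b9fdae28
So in R, given "4/271 Balmoral Road" as input, I would like "https://www.realestate.co.nz/profile/0b27093b9ce641108f7a6033b9fdae28" as output.
Can anyone help? It would be much appreciated.
I tried the code below with rvest, but it did not work: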
library(rvest)
'https://www.realestate.co.nz/profile?query=4/271%20Balmoral%20Road' %>%
read_html() %>%
html_nodes(xpath = '//*[@id="ember386"]/div[1]/div/a') %>% html_attr('href')
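Answer 0 (score: 1)
The content is retrieved dynamically, so it is not in the HTML that read_html() downloads. You can use httr to send the address query to the server and jsonlite to parse the JSON response. Each result in the response carries a "slug"; append it to the base URL to get the final URL.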
R:
library(httr)
library(jsonlite)
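params <- list(q = '4/271 Balmoral Road')
# Send the query, then parse the JSON body of the response as text
resp <- httr::GET(url = 'https://platform.realestate.co.nz/search/v1/suggest/property', query = params)
d <- jsonlite::parse_json(httr::content(resp, as = 'text', encoding = 'UTF-8'))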
base <- 'https://www.realestate.co.nz/profile/'
print(paste0(base, d$data[[1]]$slug))
Or the version the OP used:
library(httr)
library(jsonlite)
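params <- list(q = '4/271 Balmoral Road')
resp <- GET(url = 'https://platform.realestate.co.nz/search/v1/suggest/property', query = params)
# simplifyVector = FALSE keeps `data` as a list, so json$data[[1]]$slug works below
json <- fromJSON(content(resp, as = 'text', encoding = 'UTF-8'), simplifyVector = FALSE)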
base <- 'https://www.realestate.co.nz/profile/'
print(paste0(base, json$data[[1]]$slug))
Py:
import requests
params = (('q', '4/271 Balmoral Road'),)
r = requests.get('https://platform.realestate.co.nz/search/v1/suggest/property' , params=params).json()
links = [f"https://www.realestate.co.nz/profile/{i['slug']}" for i in r['data']]
print(links[0])
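If the search returns no match, indexing the first result fails. Below is a minimal sketch of the same slug-to-URL step with that case handled, and with the request URL built explicitly so the encoding of the address is visible. The helper names `build_suggest_url` and `profile_urls` are made up for illustration, and the sample payload only assumes the shape used above (a `data` list whose items have a `slug`); the real response may contain other fields.

```python
from urllib.parse import urlencode

SUGGEST_URL = "https://platform.realestate.co.nz/search/v1/suggest/property"
PROFILE_BASE = "https://www.realestate.co.nz/profile/"


def build_suggest_url(address):
    """Return the suggest endpoint URL with the address as the `q` parameter."""
    return f"{SUGGEST_URL}?{urlencode({'q': address})}"


def profile_urls(payload):
    """Turn a parsed suggest response into a list of profile URLs.

    Assumes the payload looks like {"data": [{"slug": "..."}, ...]}.
    Entries without a slug are skipped; a missing "data" key gives [].
    """
    return [
        PROFILE_BASE + item["slug"]
        for item in payload.get("data", [])
        if item.get("slug")
    ]


# Hypothetical payload for illustration; the slug is the one from the question.
sample = {"data": [{"slug": "0b27093b9ce641108f7a6033b9fdae28"}]}
urls = profile_urls(sample)
print(urls[0] if urls else "no match")
print(build_suggest_url("4/271 Balmoral Road"))
```

With a live request, the parsed JSON from `requests.get(...).json()` would be passed to `profile_urls` in place of `sample`.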