Question

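I'm having a lot of trouble with this. I'm new to Python, so sorry if I just don't know the right search terms to find the information myself. I'm not even sure the JS is the cause, but that's my best guess.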

Here's the section of HTML I'm parsing:

...
<div class="promotion">
    <div class="address">
        <a href="javascript:PropDetail2('57795471:MRMLS')" title="View property detail for 5203 Alhama Drive">5203 Alhama Drive</a>
    </div>
</div>
...

...and the Python I'm using (this version is the closest I've gotten to success):

homeFinderSoup = BeautifulSoup(open("homeFinderHTML.html"), "html5lib")
addressClass = homeFinderSoup.find_all('div', 'address')
for row in addressClass:
    print(row.get('href'))

...which returns

None
None
None

Answer 1

from bs4 import BeautifulSoup

# Create the soup from the HTML. (This assumes the file has already been read into
# the variable "html" as a string.) Naming the parser explicitly avoids a bs4 warning.
soup = BeautifulSoup(html, "html5lib")
# Find all divs with class="address"
address_class = soup.find_all('div', {"class": "address"})
# Loop over the results
for row in address_class:
    # The href lives on the <a> inside each div, not on the div itself,
    # so look up the <a> first and then read its href attribute.
    print(row.find('a').get('href'))
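
As a rough, self-contained sketch of why the original loop printed None: the snippet below parses the HTML fragment from the question and compares calling .get('href') on the div with calling it on the nested <a>. It uses the built-in "html.parser" only so that it runs without html5lib installed. The helper extract_js_argument and its regular expression are hypothetical, added only to illustrate one possible way to pull the quoted value out of the javascript: call; they are not part of BeautifulSoup.

```python
import re

from bs4 import BeautifulSoup

# HTML fragment copied from the question.
html = """
<div class="promotion">
    <div class="address">
        <a href="javascript:PropDetail2('57795471:MRMLS')" title="View property detail for 5203 Alhama Drive">5203 Alhama Drive</a>
    </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The <div> has only a class attribute, so asking it for href yields None.
div = soup.find("div", class_="address")
div_href = div.get("href")

# The href attribute belongs to the <a> nested inside the <div>.
link = div.find("a")
href = link.get("href")


def extract_js_argument(href_value):
    """Hypothetical helper: return the single-quoted argument of a JS call, or None."""
    match = re.search(r"\('([^']*)'\)", href_value)
    return match.group(1) if match else None


listing_id = extract_js_argument(href)

print(div_href)    # None
print(href)        # javascript:PropDetail2('57795471:MRMLS')
print(listing_id)  # 57795471:MRMLS
```

So the JavaScript in the href is not what causes the None values; the attribute is simply being looked up on the wrong tag. If a div might not contain an <a> at all, row.find('a') returns None, so checking for that before calling .get('href') avoids an AttributeError.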

Parsing a JavaScript href with Python
