How to scrape the topics I have answered on this website

Time: 2014-07-01 20:45:20

Tags: python selenium web-scraping

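Question

How can I modify my script so that it successfully displays the number of answers I have written per topic?

Code

This is the script I tried: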

import time

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

def get_topics('Juan-Gallardo'):
    url = "http://www.quora.com/" + 'Juan-Gallardo' + "/topics"
    browser = webdriver.Chrome()
    browser.get(url)
    time.sleep(2)
    bod = browser.find_element_by_tag_name("body")

    no_of_pagedowns = 40
    while no_of_pagedowns:
        bod.send_keys(Keys.PAGE_DOWN)
        time.sleep(0.3)
        no_of_pagedowns -= 1

    topics = [t.text.encode('ascii', 'replace') for t in browser.find_elements_by_class_name("name_text")]
    counts = [c.text.encode('ascii', 'replace').split(' ')[0] for c in browser.find_elements_by_class_name("name_meta")]

    li = [[topics[i], int(counts[i])] for i in xrange(len(topics)) if counts[i] != '']

    browser.quit()

    return li

Error

[Image: error screenshot, not available]

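1 Answer:

Answer 0: (score: 0)

A function definition must list parameter names, not literal values, so `def get_topics('Juan-Gallardo'):` is a syntax error. Define a parameter for get_topics() instead: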

def get_topics(user):
    url = "http://www.quora.com/" + user + "/topics"
    ...

Then pass the user name as an argument when you call the function:

get_topics('Juan-Gallardo')
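
For reference, here is a minimal sketch of how the whole function might look once the parameter is in place. It is written for Python 3 and the Selenium 4 locator style (`find_element(By..., ...)`), which is a substitution for the `find_element_by_*` calls in the question. The helper names `build_topics_url` and `pair_topics_with_counts` and the `pagedowns` parameter are hypothetical, added only for illustration. The class names `name_text` and `name_meta` are taken from the question and may no longer match the site's markup. Skipping entries whose count does not start with a number is also an assumption, not something the original script did.

```python
import time


def build_topics_url(user):
    """Return the topics page URL for the given user name."""
    return "http://www.quora.com/" + user + "/topics"


def pair_topics_with_counts(topics, counts):
    """Pair each topic with its answer count.

    Entries whose count text is empty or does not start with a number
    are skipped (an assumption made for this sketch).
    """
    pairs = []
    for topic, count_text in zip(topics, counts):
        parts = count_text.split()
        # Drop thousands separators such as "1,234" before converting.
        first = parts[0].replace(",", "") if parts else ""
        if first.isdecimal():
            pairs.append([topic, int(first)])
    return pairs


def get_topics(user, pagedowns=40):
    """Open the user's topics page, scroll down, and collect topic counts."""
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    browser = webdriver.Chrome()
    try:
        browser.get(build_topics_url(user))
        time.sleep(2)
        body = browser.find_element(By.TAG_NAME, "body")

        # Scroll so that lazily loaded topics are rendered.
        for _ in range(pagedowns):
            body.send_keys(Keys.PAGE_DOWN)
            time.sleep(0.3)

        topics = [t.text for t in browser.find_elements(By.CLASS_NAME, "name_text")]
        counts = [c.text for c in browser.find_elements(By.CLASS_NAME, "name_meta")]
    finally:
        # Close the browser even if one of the lookups fails.
        browser.quit()

    return pair_topics_with_counts(topics, counts)
```

Calling `get_topics('Juan-Gallardo')` would then return a list of `[topic, count]` pairs. Note that `zip` stops at the shorter of the two lists, so if the page yields a different number of topic and count elements, the extra ones are dropped rather than raising an IndexError.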