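Using Selenium WebDriver with Python, I can locate the search box and run a search that returns results, but I want to print the results from the first 10 rows returned (excluding the header row).
The site I'm using is http://www.hoovers.com/company-information/company-search.html?term=simon, with "simon" as an example search term.
I've been searching for a while and have tried many things, including XPaths, most of which raised errors. This is the closest I've gotten so far: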
for row in mydriver.find_elements_by_class_name('cmp-company-directory'):
    cell = row.find_elements_by_tag_name("td")[0]
    print(cell.text)
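However, it only returns the first row and does not iterate through the table. Any tips? TIA!
Answer 0 (score: 0)
Try the XPath below; it walks through the table cells and prints the first 10 rows.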
elements = driver.find_elements_by_xpath("//div[@class='clear data-table sortable-header dashed-table-tr alternate-rows']//tr/td")
counter = 1
for element in elements:
    print(element.text)
    counter += 1
    # The limit counts individual cells, not rows.
    if counter == 50:
        break
Output:
Simon Property Group, Inc.
Indianapolis, IN, United States
$5538.64M
See Details
SIMON & SCHUSTER (UK) LIMITED
London, London, England
$60.39M
See Details
SIMON JERSEY GROUP LIMITED
Accrington, Lancashire, England
See Details
Simon Worldwide, Inc.
Irvine, CA, United States
$0.0M
See Details
Simon Property Group, L.P.
Indianapolis, IN, United States
$5538.64M
See Details
Günter Simon e.K. Inh. Carmen Simon
Ravensburg, Baden-Württemberg, Germany
See Details
Simon e Simon Servicos Odontologicos Ltda
Vere, Parana, Brazil
See Details
Simon Comercial e Industrial Ltda Em Recuperacao Judicial
Aparecida De Goiania, Goias, Brazil
See Details
Simon Levelt B.V.
Haarlem, Noord-Holland, The Netherlands
See Details
SIMON SAU
Barcelona, Barcelona, Spain
$115.95M
See Details
If you only want to print the company names from the first 10 rows, try this.
elements = driver.find_elements_by_xpath("//div[@class='clear data-table sortable-header dashed-table-tr alternate-rows']//tr/td[@class='company_name']")
counter = 0
for element in elements:
    print(element.text)
    counter += 1
    if counter == 10:
        break
Output:
Simon Property Group, Inc.
SIMON & SCHUSTER (UK) LIMITED
SIMON JERSEY GROUP LIMITED
Simon Worldwide, Inc.
Simon Property Group, L.P.
Günter Simon e.K. Inh. Carmen Simon
Simon e Simon Servicos Odontologicos Ltda
Simon Comercial e Industrial Ltda Em Recuperacao Judicial
Simon Levelt B.V.
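Both loops above stop with a hand-maintained counter. As a small sketch, the same cutoff can be expressed with itertools.islice. first_texts is a hypothetical helper name, and the only assumption is that each element exposes a .text attribute, as Selenium's WebElement does.

```python
from itertools import islice


def first_texts(elements, limit=10):
    """Return the .text of at most `limit` elements, preserving their order."""
    # islice stops after `limit` items, so no counter or break is needed.
    return [element.text for element in islice(elements, limit)]
```

For example, print("\n".join(first_texts(elements, 10))) would print up to ten company names from the elements list built above.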
Let me know if this works for you.
Answer 1 (score: 0)
To print the company names (excluding the header row), you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following solutions:
Using CSS_SELECTOR:
print([company_name.get_attribute("innerHTML") for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.cmp-company-directory table td.company_name>a")))])
Using XPATH:
print([company_name.get_attribute("innerHTML") for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='cmp-company-directory']//table//td[@class='company_name']/a")))])
To print the first 10 company names (excluding the header row), you have to induce WebDriverWait for visibility_of_all_elements_located() and then use [:10] to limit the list to 10 elements. You can use either of the following solutions:
Using CSS_SELECTOR:
print([company_name.text for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.cmp-company-directory table td.company_name>a")))[:10]])
Using XPATH:
print([company_name.text for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='cmp-company-directory']//table//td[@class='company_name']/a")))[:10]])
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC