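I am trying to extract data from an HTML table. I count the rows successfully, but when I print them, the same row keeps repeating. Can anyone tell me what is wrong with my code? Thanks.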
#counting length of row
rows = len(driver.find_elements_by_xpath('/html/body/form/fieldset/table[2]/tbody/tr/td[3]/table/tbody/tr[5]/td[2]/div/table[1]/tbody/tr[2]/td[1]/table[2]/tbody/tr'))
time.sleep(2)
print(rows)
for r in range(rows):
    value=driver.find_element_by_xpath('/html/body/form/fieldset/table[2]/tbody/tr/td[3]/table/tbody/tr[5]/td[2]/div/table[1]/tbody/tr[2]/td[1]/table[2]/tbody/tr["+str(r)+"]')
    print(value.text)
#Output:
18 #no of rows
Start of legal relation2/7/2018 #1st row
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
Start of legal relation2/7/2018
sample test case successfully completed
Answer 0 (score: 0)
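Without a URL, it is hard to say what the cause is. However, the first tr element should be [1], so I think your range call should be range(1, rows + 1). Your approach also seems very indirect, since your first query appears to have already retrieved all the elements you are looking for. So why not just do the following?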
elements = driver.find_elements_by_xpath('/html/body/form/fieldset/table[2]/tbody/tr/td[3]/table/tbody/tr[5]/td[2]/div/table[1]/tbody/tr[2]/td[1]/table[2]/tbody/tr')
#time.sleep(2) # what does this accomplish?
print(len(elements))
text_list = [element.text for element in elements] # list of strings
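If you do need a separate lookup per row, one likely reason for the repeated output is the quoting in the question's loop: the XPath is a single-quoted Python literal, so "+str(r)+" stays inside the string as text and r is never substituted. In XPath 1.0 a predicate that is a non-empty string literal evaluates to true, so every tr matches and find_element returns the first one each time. The sketch below only builds the strings (no browser needed) to show the difference; row_xpath and broken_row_xpath are hypothetical helper names, and rows = 3 is a stand-in for the 18 rows in the question.

```python
# Row XPath copied from the question.
ROW_XPATH = (
    "/html/body/form/fieldset/table[2]/tbody/tr/td[3]/table/tbody/tr[5]"
    "/td[2]/div/table[1]/tbody/tr[2]/td[1]/table[2]/tbody/tr"
)


def broken_row_xpath(r):
    # Same quoting as in the question (hypothetical helper): the predicate
    # is part of one single-quoted literal, so r is never used.
    return ROW_XPATH + '["+str(r)+"]'


def row_xpath(index):
    # Hypothetical helper: XPath positions are 1-based, so index starts at 1.
    return f"{ROW_XPATH}[{index}]"


rows = 3  # stand-in for the row count; the question reports 18

broken = [broken_row_xpath(r) for r in range(rows)]
fixed = [row_xpath(i) for i in range(1, rows + 1)]

print(len(set(broken)))  # 1: every iteration asks for the same XPath
print(len(set(fixed)))   # 3: one distinct XPath per row
```

With the driver from the question, the loop body would then be something like value = driver.find_element_by_xpath(row_xpath(i)) for i in range(1, rows + 1), although collecting the elements once, as above, avoids the repeated lookups.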