I am trying to use Selenium in Python to read the data between > and < in the following markup.
<tbody>
<tr>
<td>
<div class="answer-votes" title="Asked 8 non-wiki questions with a total score of 164. Gave 84 non-wiki answers with a total score of 337." onclick="window.location.href='/search?q=user:37181+[python]'">337</div>
<a href="/search?q=user:37181+[python]" class="post-tag" title="">python</a>
<span class="item-multiplier" title="93 posts in the python tag"><span class="item-multiplier-x">×</span> <span class="item-multiplier-count">93</span></span></td>
<td>
<div class="answer-votes" title=" Gave 4 non-wiki answers with a total score of 22." onclick="window.location.href='/search?q=user:37181+[django-templates]'">22</div>
<a href="/search?q=user:37181+[django-templates]" class="post-tag" title="">django-templates</a>
<span class="item-multiplier" title="4 posts in the django-templates tag"><span class="item-multiplier-x">×</span> <span class="item-multiplier-count">4</span></span></td>
<td>
<div class="answer-votes" title=" Gave 1 non-wiki answer with a total score of 12." onclick="window.location.href='/search?q=user:37181+[slug]'">12</div>
<a href="/search?q=user:37181+[slug]" class="post-tag" title="">slug</a>
</td>
<td>
<div class="answer-votes" title=" Gave 1 non-wiki answer with a total score of 8." onclick="window.location.href='/search?q=user:37181+[google-app-engine]'">8</div>
<a href="/search?q=user:37181+[google-app-engine]" class="post-tag" title=""><img src="//i.stack.imgur.com/vobok.png" height="16" width="18" alt="" class="sponsor-tag-img">google-app-engine</a>
</td>
</tr>
<tr>
<td>
<div class="answer-votes" title="Asked 1 non-wiki question with a total score of 89. Gave 56 non-wiki answers with a total score of 235." onclick="window.location.href='/search?q=user:37181+[django]'">235</div>
<a href="/search?q=user:37181+[django]" class="post-tag" title="">django</a>
<span class="item-multiplier" title="57 posts in the django tag"><span class="item-multiplier-x">×</span> <span class="item-multiplier-count">57</span></span></td>
<td>
<div class="answer-votes" title="Asked 1 non-wiki question with a total score of 21. Gave 1 non-wiki answer with a total score of 22." onclick="window.location.href='/search?q=user:37181+[clang]'">22</div>
<a href="/search?q=user:37181+[clang]" class="post-tag" title="">clang</a>
<span class="item-multiplier" title="2 posts in the clang tag"><span class="item-multiplier-x">×</span> <span class="item-multiplier-count">2</span></span></td>
<td>
<div class="answer-votes" title=" Gave 1 non-wiki answer with a total score of 12." onclick="window.location.href='/search?q=user:37181+[connect]'">12</div>
<a href="/search?q=user:37181+[connect]" class="post-tag" title="show all posts by this user in 'connect'">connect</a>
</td>
<td>
<div class="answer-votes" title=" Gave 1 non-wiki answer with a total score of 8." onclick="window.location.href='/search?q=user:37181+[memcached]'">8</div>
<a href="/search?q=user:37181+[memcached]" class="post-tag" title="">memcached</a>
</td>
</tr>
</tbody>
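However, when the loop moves on to the next <td>, my program does not show the values for that <td>; it keeps printing the same ones. Could you tell me how to fix this? Here is my code: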
driver.get("https://stackoverflow.com/users/37181/alex-gaynor?tab=tags")
SMRTable = driver.find_elements_by_xpath("//*[@class='user-tags'] //td")
for i in SMRTable:
    print(i.get_attribute('innerHTML'))
    print(i.find_element_by_xpath("//div[@class='answer-votes']").get_attribute('innerHTML'))
    print(i.find_element_by_xpath("//*[@class='post-tag']").get_attribute('innerHTML'))
    print(i.find_element_by_xpath("//span[@class='item-multiplier-count']").get_attribute('innerHTML'))
    print('\n')
Answer 0 (score: 2)
If you want to process each td in the table, you need to start each XPath expression with a dot (the context node), e.g. replace
print(i.find_element_by_xpath("//div[@class='answer-votes']").get_attribute('innerHTML'))
with
print(i.find_element_by_xpath(".//div[@class='answer-votes']").get_attribute('innerHTML'))
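Otherwise you will get the same values on every iteration (only those from the first td), because an XPath that begins with // is evaluated against the whole document rather than the current element.
Also note that you shouldn't use get_attribute('innerHTML') to get the text content of a node; use the text property instead: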
print(i.find_element_by_xpath(".//div[@class='answer-votes']").text)
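To try the per-cell lookup without starting a browser, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree as a stand-in for Selenium (that substitution is an assumption of this example, not something the question uses). SAMPLE is a trimmed copy of two cells from the question, and votes_per_cell is a hypothetical helper name. ElementTree only accepts relative paths on an element, so this shows the context-relative search rather than reproducing the bug.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of two cells from the question's table (most attributes dropped).
SAMPLE = """
<table class="user-tags"><tbody><tr>
<td><div class="answer-votes">337</div>
<a class="post-tag">python</a>
<span class="item-multiplier"><span class="item-multiplier-count">93</span></span></td>
<td><div class="answer-votes">22</div>
<a class="post-tag">django-templates</a>
<span class="item-multiplier"><span class="item-multiplier-count">4</span></span></td>
</tr></tbody></table>
"""


def votes_per_cell(markup):
    """Return one (tag, votes) pair per <td>, searching relative to each cell."""
    root = ET.fromstring(markup.strip())
    results = []
    for td in root.iter("td"):
        # The leading dot anchors the search at the current <td>.
        votes = td.find(".//div[@class='answer-votes']")
        tag = td.find(".//a[@class='post-tag']")
        results.append((tag.text, votes.text))
    return results


print(votes_per_cell(SAMPLE))
```

Each cell yields its own pair here: ('python', '337') for the first and ('django-templates', '22') for the second.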
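Answer 1 (score: 1)
Your attempt was nearly right. You need to take care of a few more things:
- When building SMRTable with find_elements_by_xpath(), include the tag name, i.e. table.
- Use a leading . (the dot operator) to set the reference to the current node.
- //div[@class='answer-votes'] is a direct child of the td, so change it to ./div[@class='answer-votes']
- //*[@class='post-tag'] is always an <a> tag, so use .//a[@class='post-tag']
Your working code would be: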
driver.get("https://stackoverflow.com/users/37181/alex-gaynor?tab=tags")
SMRTable = driver.find_elements_by_xpath("//table[@class='user-tags']//tr/td")
for i in SMRTable:
    print(i.find_element_by_xpath("./div[@class='answer-votes']").get_attribute('innerHTML'))
    print(i.find_element_by_xpath(".//a[@class='post-tag']").get_attribute('innerHTML'))
    print(i.find_element_by_xpath(".//span[@class='item-multiplier-count']").get_attribute('innerHTML'))
Console output:
337
python
93
22
django-templates
4
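Two caveats apply to the loop above. The find_element_by_xpath / find_elements_by_xpath helpers were removed in Selenium 4.3, where find_element(By.XPATH, ...) is used instead. Also, cells such as slug and memcached in the question's markup have no item-multiplier-count span, so find_element would raise NoSuchElementException on them. The following is a minimal sketch that addresses both points, not a definitive implementation: scrape_tags is a hypothetical name, the XPath strings are taken from this answer and assume the page still has that markup, and the locator strategy is passed as the plain string "xpath" (the value of By.XPATH) so the function has no import-time dependency on Selenium.

```python
XPATH = "xpath"  # same value as selenium.webdriver.common.by.By.XPATH


def scrape_tags(driver):
    """Return (votes, tag, post_count) per cell; post_count is None when absent."""
    rows = []
    for td in driver.find_elements(XPATH, "//table[@class='user-tags']//tr/td"):
        # Leading dots keep every lookup inside the current <td>.
        votes = td.find_element(XPATH, "./div[@class='answer-votes']").text
        tag = td.find_element(XPATH, ".//a[@class='post-tag']").text
        # find_elements returns an empty list instead of raising when nothing matches.
        counts = td.find_elements(XPATH, ".//span[@class='item-multiplier-count']")
        rows.append((votes, tag, counts[0].text if counts else None))
    return rows
```

With a live session, this would be called as scrape_tags(driver) after driver.get(...) has loaded the tags page; cells without a multiplier come back with None in the third position.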