<div class="island biz-owner-reply clearfix">
<div class="biz-owner-reply-header arrange arrange--6">
<div class="arrange_unit biz-owner-reply-photo">
<div class="photo-box pb-30s">
<a href="https://s3-media1.fl.yelpcdn.com/buphoto/QdBQ1FI9os4heZH9rFAV6Q/o.jpg">
<img alt="Beckie F." class="photo-box-img" height="30" src="https://s3-media4.fl.yelpcdn.com/buphoto/QdBQ1FI9os4heZH9rFAV6Q/30s.jpg" srcset="https://s3-media4.fl.yelpcdn.com/buphoto/QdBQ1FI9os4heZH9rFAV6Q/90s.jpg 3.00x,https://s3-media4.fl.yelpcdn.com/buphoto/QdBQ1FI9os4heZH9rFAV6Q/ss.jpg 1.33x" width="30">
</a>
</div>
</div>
<div class="arrange_unit arrange_unit--fill embossed-text-white">
<strong>
Comment from Beckie F. of Yard House
</strong>
<br>
Business Customer Service
</div>
</div>
<span class="bullet-after">4/4/2018</span>
Hi Kim. We are happy to be apart of the community. Thank you for the warm welcome!
<div class="review-footer clearfix"></div>
</div>
I am trying to get the value of the element with class biz-owner-reply using Python and Selenium. I first find the element and then try to get its value as follows:
response = ""
responses = review_wrappers[0].find_elements_by_class_name("biz-owner-reply")
if len(responses) > 0:
    response = responses[0].text
However, the result also contains the text of its child elements:
'response':'Comment from Beckie F. of Yard House\nBusiness Customer Service\n4/4/2018 Hi Kim. We are happy to be apart of the community. Thank you for the warm welcome!'
How can I get only:
Hi Kim. We are happy to be apart of the community. Thank you for the warm welcome!
Answer 0: (score: 1)
Selenium can only return element nodes, not text nodes, so we need JavaScript and the HTML DOM API to achieve your goal.
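# Collect only the direct text-node children of the element passed as arguments[0].
# childNodes returns every child node of the parent.
# nodeType === 1: element node (an HTML tag)
# nodeType === 2: attribute node (an attribute of an HTML tag)
# nodeType === 3: text node (text inside an HTML tag)
script = """
return Array.from(arguments[0].childNodes)
    .filter(function(node){return node.nodeType === 3;})
    .map(function(node){return node.nodeValue;})
    .join('');
"""
# In Selenium 4, prefer driver.find_element(By.CSS_SELECTOR, "div.biz-owner-reply"),
# with By imported from selenium.webdriver.common.by.
ele = driver.find_element_by_css_selector("div.biz-owner-reply")
# strip() drops the whitespace left over from the text nodes between child tags.
txt = driver.execute_script(script, ele).strip()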
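The filtering idea above (keep only the text nodes that are direct children of the target element) can also be tried without a browser. The following is a minimal sketch using Python's standard-library html.parser; the names DirectTextCollector, direct_text, VOID_TAGS, and the shortened sample markup are illustrative assumptions, not part of the original answer, and it assumes the HTML fragment has a single root element.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change the depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class DirectTextCollector(HTMLParser):
    """Collect only the text that sits directly inside the outermost element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of currently open (non-void) elements
        self.chunks = []  # text found while exactly one element is open

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        # depth == 1 means the text is a direct child of the root element.
        if self.depth == 1:
            self.chunks.append(data)


def direct_text(html):
    """Return the root element's own text, with whitespace normalized."""
    parser = DirectTextCollector()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


# Hypothetical, shortened version of the markup from the question.
sample = (
    '<div class="biz-owner-reply">'
    '<div class="biz-owner-reply-header">'
    '<strong>Comment from Beckie F. of Yard House</strong>'
    '<br>Business Customer Service</div>'
    '<span class="bullet-after">4/4/2018</span>\n'
    'Hi Kim. Thank you for the warm welcome!\n'
    '<div class="review-footer clearfix"></div>'
    '</div>'
)

print(direct_text(sample))
```

With Selenium, the same function could in principle be fed the element's outerHTML attribute, though the JavaScript approach avoids re-parsing the markup.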
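Answer 1: (score: 0)
The question seems a bit unclear. Yong had the same idea as I did. As far as I can tell, you only want to retrieve the core text of the message, but your result includes everything in the reply.
For example, if your SQL table had only three columns:
id, date, text
and you tried to extract only the text the way you are doing now... you would get all of the text.
If you only want to extract the comment, I think you would need:
an SQL or XML file with #core_message
answers = $ core_message
I would need more information, but the idea is to select just the single element rather than all of the information...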