Selenium: how to extract the text between two div tags

Time: 2013-03-28 06:21:01

Tags: java selenium web automation

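I am new to using Selenium for web automation, and I am having trouble extracting the text between two div tags.

Below is a snippet of the HTML I am trying to extract the text from.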

 ...
<tr>
    <td width="150">
    <a href="https://rads.stackoverflow.com/amzn/click/com/B0099RGRT8" rel="nofollow noreferrer">
    <img height="90" border="0" width="90" alt="iOttie Easy Flex2 Windshield Dashboard Car Mount H&hellip by iOttie" src="http://ecx.images-amazon.com/images/I/51mf6Ry9J2L._SL500_SS90_.jpg">
    </a>
    <div class="xxsmall" style="margin-top: 5px">
        <a href="https://rads.stackoverflow.com/amzn/click/com/B0099RGRT8" rel="nofollow noreferrer">iOttie Easy Flex2 Windshield Dashboard Car Mount Holder Desk Stand for iPhone 5 4S 4 3GS Samsung Gal&amp;hellip</a>
        by iOttie
    </div>
    </td>
    <td style="padding-left: 10px;">
        <div>
            <div>
                <span style="margin-left:-5px; vertical-align: -1">

                </span>
                <b>
                <a href="http://www.amazon.com/gp/cdp/member-reviews/A2UQ07EFPSX78X/ref=cm_pdp_rev_title_1?ie=UTF8&sort_by=MostRecentReview#R12ATB4KTIWFV8">Bought for my wife, now I want one. Excellent Product.</a>
                </b>
                ,
                <span class="nowrap">November 30, 2012</span>
            </div>
            <div style="margin-top: 5px;">
                I bought this mount for my wife, the feedback from her was is that it was really nice and easy to use even while driving.
                <br>
                <br>
                So I "borrowed" it for a couple days, and now I am going to get one for myself. I am using it with an iPhone, but it would work fine with phones of all sizes, which is nice. If my phone size ever changes the mount will accommodate different sizes phones.
                <br>
                <br>
                The phone is very easy to insert and remove , even while driving.
                <br>
                The mount is easy to position but not loose enough that it doesn't hold the position you want.
                <br>
                <br>
                I was very impressed with the windshield mount, it is not just a typical suction cup mount. (Which always at some point…
                <a href="http://www.amazon.com/gp/cdp/member-reviews/A2UQ07EFPSX78X/ref=cm_pdp_rev_more?ie=UTF8&sort_by=MostRecentReview#R12ATB4KTIWFV8">Read more</a>
            </div>
        </div>
    </td>
</tr>
...

The other div tags also contain text of their own.

The text I want to extract is:

            I bought this mount for my wife, the feedback from her was is that it was really nice and easy to use even while driving.

            So I "borrowed" it for a couple days, and now I am going to get one for myself. I am using it with an iPhone, but it would work fine with phones of all sizes, which is nice. If my phone size ever changes the mount will accommodate different sizes phones.

            The phone is very easy to insert and remove , even while driving.

            The mount is easy to position but not loose enough that it doesn't hold the position you want.

            I was very impressed with the windshield mount, it is not just a typical suction cup mount. (Which always at some point…

Here is my code:

String review;
try {
    review = WebElement.bucketElement.findElement(By.xpath("./td/div")).getText();
} catch (NoSuchElementException nsee) {
    review = "NA";
}

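This actually extracts all the text from all the inner div tags, which is not what I want. I can target a specific div with ./td/div/div[3], but I cannot get only the text between the div tags.

Any ideas?

Thanks

2 Answers:

Answer 0: (score: 1)

You can use a regular expression as a workaround: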

String review;
try {
    review = WebElement.bucketElement.findElement(By.xpath("./td/div")).getText();
    // Strings are immutable, so the result of replaceAll must be assigned back
    review = review.replaceAll("(<.+>)", "");
} catch (NoSuchElementException nsee) {
    review = "NA";
}

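The regular expression removes all tags together with the text of the inner elements, so only the first-level text remains. This means that if you have:

some strange<div>other text</div> text

the resulting string is: some strange text

If you need a more complex regular expression, here is a useful link to test it.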
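To see what this pattern does in isolation, here is a minimal sketch in plain Java without Selenium. The class and method names (TagStripper, stripTagsAndInnerText) are hypothetical and chosen only for illustration. Keep in mind that getText() returns rendered text with no markup, so the pattern only has something to match when the input still contains tags, for example a string read with getAttribute("innerHTML").

```java
import java.util.regex.Pattern;

final class TagStripper {
    // Same greedy pattern as in the answer: it matches from the first '<'
    // to the last '>' on a line, including any text in between.
    private static final Pattern TAGS_AND_INNER_TEXT = Pattern.compile("(<.+>)");

    // Returns the input with the matched span removed.
    static String stripTagsAndInnerText(String markup) {
        return TAGS_AND_INNER_TEXT.matcher(markup).replaceAll("");
    }

    public static void main(String[] args) {
        // The example from the answer: only the first-level text is left.
        System.out.println(stripTagsAndInnerText("some strange<div>other text</div> text"));

        // Caveat: because the match is greedy, first-level text that sits
        // between two sibling elements on the same line is removed as well.
        System.out.println(stripTagsAndInnerText("a<b>x</b> keep <i>y</i> z"));
    }
}
```

The second call shows the main limitation: the word "keep" is lost because it lies between the first '<' and the last '>'. Also, '.' does not match line breaks by default, so markup that spans several lines is handled line by line.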

Answer 1: (score: 0)

Once you have located the element using /td/div/div[3], calling getText() on that WebElement returns the text inside that div/element.
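One thing to watch for with this approach: getText() on a div also includes the visible text of its child elements, so for the review div in the question the result would end with the "Read more" link label. A minimal sketch of trimming that label off is shown below in plain Java. The class and method names (ReviewText, withoutTrailingLinkText) are hypothetical, and the sample string is a shortened stand-in for what getText() might return.

```java
final class ReviewText {
    // Removes a trailing link label (for example "Read more") from the
    // text of a div. If the text does not end with the label, it is only trimmed.
    static String withoutTrailingLinkText(String divText, String linkText) {
        String trimmed = divText.trim();
        if (!linkText.isEmpty() && trimmed.endsWith(linkText)) {
            trimmed = trimmed.substring(0, trimmed.length() - linkText.length()).trim();
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the text of the review div.
        String divText = "The phone is very easy to insert and remove , even while driving.\nRead more";
        System.out.println(withoutTrailingLinkText(divText, "Read more"));
    }
}
```

With Selenium, divText would come from calling getText() on the located div, and linkText could be read from the link inside it, for example with findElement(By.tagName("a")).getText() on that same div.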
