XPath - sum and multiply

Date: 2014-04-10 12:45:39

Tags: xpath

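I am scraping a web page with XPath and need to store the deposit as a number. The deposit should be ("monthly rent" x "number of months of prepaid rent"). In this case the result should be 15450.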

<table>
<tr>
<td>monthly rent: </td>
<td>5.150,00</td>
</tr>
<tr>
<td>deposit: </td>
<td>3 mdr.</td>
</tr>
</table>

I am currently using the following XPath to find the information:

//td[contains(.,'Depositum') or contains(.,'Husleje ')]/following-sibling::td/text()

But I don't know how to remove " mdr." from the deposit, or how to multiply the two numbers so that only a single number is returned to the database.

2 answers:

Answer 0: (score: 3)

You can use the following query, which is compatible with XPath 1.0 and later:

substring-before(//td[contains(.,'deposit:')]/following-sibling::td/text(), ' mdr.') * translate(//td[contains(.,'monthly rent:')]/following-sibling::td/text(), ',.', '') div 100

Output:

15450

Step-by-step explanation:

// Get the deposit and remove mdr. from it using substring-before
substring-before(//td[contains(.,'deposit:')]/following-sibling::td/text(), ' mdr.')

// Arithmetic multiply operator
* 

// The number format 5.150,00 can't be used for arithmetic calculations.
// Therefore we get the monthly rent and remove the . and , characters from it.
// Note that this is equivalent to multiplying the rent by 100, which is why
// we divide by 100 later on.
translate(//td[contains(.,'monthly rent:')]/following-sibling::td/text(), ',.', '')

// Divide by 100
div 100

For reference, see the List of Functions and Operators supported by XPath 1.0 and 2.0.
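The steps above can also be checked outside an XPath engine. The following is only a sketch in Python using the standard library. It does not evaluate the XPath expression itself; it mirrors each step (substring-before, translate, multiply, div 100) with plain string operations. The names value_cell and deposit_amount are made up for this illustration, and the markup is the sample table from the question.

```python
import xml.etree.ElementTree as ET

# Sample table copied from the question.
MARKUP = """
<table>
<tr>
<td>monthly rent: </td>
<td>5.150,00</td>
</tr>
<tr>
<td>deposit: </td>
<td>3 mdr.</td>
</tr>
</table>
"""


def value_cell(table, label):
    """Return the text of the <td> directly after the <td> containing label."""
    for row in table.iter("tr"):
        cells = row.findall("td")
        for i, cell in enumerate(cells[:-1]):
            if label in (cell.text or ""):
                return cells[i + 1].text or ""
    raise ValueError(f"label not found: {label!r}")


def deposit_amount(markup):
    table = ET.fromstring(markup.strip())
    # Mirrors substring-before(..., ' mdr.'): keep what precedes " mdr."
    months = int(value_cell(table, "deposit:").split(" mdr.")[0])
    # Mirrors translate(..., ',.', ''): drop both separators, which leaves
    # the rent expressed in hundredths (5.150,00 becomes 515000).
    rent = value_cell(table, "monthly rent:")
    rent_hundredths = int(rent.replace(",", "").replace(".", ""))
    # Mirrors "* ... div 100": multiply, then undo the factor of 100.
    return months * rent_hundredths / 100


print(deposit_amount(MARKUP))
```

Unlike substring-before, which returns an empty string when the marker is missing, this sketch raises an error from int() if a cell does not have the expected shape.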

Answer 1: (score: 0)

A pure XPath solution:

translate(
    /table/tr/td[contains(., 'monthly rent')]/following-sibling::td[1],
    ',.',
    '.'
)
* 
substring-before(
    /table/tr/td[contains(., 'deposit')]/following-sibling::td[1],
    ' mdr'
)

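It seems I arrived at a solution very similar to hek2mgl's correct answer, with two differences. First, no division by 100 is needed, because translate() converts the comma to a dot and removes the original dots, so 5.150,00 becomes 5150.00. Second, the <td> elements holding the numeric data are selected with a positional predicate ([1]), to avoid matching additional elements if the real table is not as simple as the given example. The XPath number format requires a dot as the decimal separator and does not allow thousands separators.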
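The difference between the two translate() calls is easier to see side by side. The sketch below emulates the XPath 1.0 translate() rules in Python: each character of the second argument is replaced by the character at the same position in the third argument, and characters with no counterpart are removed. The function name xpath_translate is made up for this illustration, and the month count 3 is taken from the sample table.

```python
def xpath_translate(text, from_chars, to_chars):
    """Emulate XPath 1.0 translate(text, from_chars, to_chars)."""
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ch in mapping:
            continue  # the first occurrence of a character wins
        # Characters beyond the end of to_chars are removed from the result.
        mapping[ch] = to_chars[i] if i < len(to_chars) else ""
    return "".join(mapping.get(ch, ch) for ch in text)


rent = "5.150,00"

# Answer 0: remove both separators, then divide by 100 afterwards.
as_hundredths = xpath_translate(rent, ",.", "")
deposit_0 = 3 * float(as_hundredths) / 100

# Answer 1: turn the comma into a dot and remove the thousands dot.
as_decimal = xpath_translate(rent, ",.", ".")
deposit_1 = float(as_decimal) * 3

print(as_hundredths, as_decimal, deposit_0, deposit_1)
```

Both variants arrive at the same deposit. The second one keeps the decimal part in place, so it would also work for a rent whose decimals are not 00 without any extra scaling step.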