Search a text value and then find a number

Time: 2017-10-24 04:40:56

Tags: python python-2.7

I want to extract the salary and the currency from strings like these:

A= "my name is adam. expected salary usd 5000 USD test test  "
B= "my name is sara. expected salary 8200 MYR test tessksjdkjs "
C= "my name is sara. expected salary IDR 999944 and iam  ksdjfjksdh "

How can I search each string to find the expected salary and its currency?

The expected results: 

A= salary 5000     CURRENCY   USD
B= salary 8200     CURRENCY   MYR
C= salary 999944   CURRENCY   IDR

1 answer:

Answer 0 (score: 0)

This solution assumes that the required data comes immediately after the text "salary " in the string.

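You can use a regular expression to extract the two fields that follow it. Then check which field is an integer value: that one is the salary, and the other is the currency.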

import re
A = "my name is adam. expected salary usd 5000 USD test test  "
B = "my name is sara. expected salary 8200 MYR test tessksjdkjs "
C = "my name is sara. expected salary IDR 999944 and iam  ksdjfjksdh "
for x in A, B, C:
    # Take the first two space-separated tokens that follow "salary ".
    required_data = re.findall(r'salary ([^/]+)', x)[0].split(' ')[:2]
    # The all-digit token is the salary; the other one is the currency.
    if required_data[0].isdigit():
        print('salary ' + required_data[0] + ' currency ' + required_data[1])
    else:
        print('salary ' + required_data[1] + ' currency ' + required_data[0])

Output

salary 5000 currency usd
salary 8200 currency MYR
salary 999944 currency IDR
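
Note that the first line prints "usd" in lowercase, while the question asks for "USD". As a possible variation, here is a minimal sketch that matches both orderings in a single pattern and upper-cases the currency. It assumes the currency is always a three-letter code and the amount is a plain run of digits; the names SALARY_RE and extract_salary are made up for this example and are not part of the original answer.

```python
import re

# Hypothetical pattern: after "salary", accept either
# "<code> <amount>" or "<amount> <code>" (three-letter code assumed).
SALARY_RE = re.compile(
    r'salary\s+(?:(?P<cur1>[A-Za-z]{3})\s+(?P<amt1>\d+)'
    r'|(?P<amt2>\d+)\s+(?P<cur2>[A-Za-z]{3})\b)'
)

def extract_salary(text):
    """Return (amount, currency), or None if no salary phrase is found."""
    m = SALARY_RE.search(text)
    if m is None:
        return None
    # Only one of the two alternatives matched, so one pair of groups is None.
    amount = m.group('amt1') or m.group('amt2')
    currency = m.group('cur1') or m.group('cur2')
    return int(amount), currency.upper()

A = "my name is adam. expected salary usd 5000 USD test test  "
B = "my name is sara. expected salary 8200 MYR test tessksjdkjs "
C = "my name is sara. expected salary IDR 999944 and iam  ksdjfjksdh "

for x in A, B, C:
    result = extract_salary(x)
    if result is not None:
        print('salary %d currency %s' % result)
```

With the three sample strings this should print "salary 5000 currency USD", "salary 8200 MYR" style lines for each input, with the currency always in uppercase. Returning None when nothing matches avoids the IndexError that indexing an empty findall result would raise on a string without "salary".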