I want to extract the salary and the currency from strings like these:
A= "my name is adam. expected salary usd 5000 USD test test "
B= "my name is sara. expected salary 8200 MYR test tessksjdkjs "
C= "my name is sara. expected salary IDR 999944 and iam ksdjfjksdh "
How can I search each string to find the expected salary and its currency?
The expected results:
A= salary 5000 CURRENCY USD
B= salary 8200 CURRENCY MYR
C= salary 999944 CURRENCY IDR
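Answer 0 (score: 0)
This solution assumes that the required data comes immediately after the text "salary " in the string.
You can use a regular expression to extract the two fields that follow it, then check which one is an integer: that one is the salary, and the other is the currency.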
import re
A= "my name is adam. expected salary usd 5000 USD test test "
B= "my name is sara. expected salary 8200 MYR test tessksjdkjs "
C= "my name is sara. expected salary IDR 999944 and iam ksdjfjksdh "
for x in (A, B, C):
    # Capture everything after "salary " and keep the first two space-separated tokens.
    required_data = re.findall(r'salary ([^/]+)', x)[0].split(' ')[:2]
    # The all-digit token is the salary; the other token is the currency.
    if required_data[0].isdigit():
        print('salary ' + required_data[0] + ' currency ' + required_data[1])
    else:
        print('salary ' + required_data[1] + ' currency ' + required_data[0])
Output:
salary 5000 currency usd
salary 8200 currency MYR
salary 999944 currency IDR
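Note that the first line prints the currency as "usd", because that is how it appears in the input, while the question expects "USD". A slightly stricter variant is sketched below. It assumes that the currency is always a three-letter code and that the amount is a plain run of digits, and it upper-cases the currency. The names `SALARY_RE` and `extract_salary` are made up for this illustration and are not part of the original answer.

```python
import re

# Assumption: after "salary" comes either "<3-letter code> <digits>" or
# "<digits> <3-letter code>". Named groups keep the two orders apart.
SALARY_RE = re.compile(
    r"salary\s+(?:(?P<cur1>[A-Za-z]{3})\s+(?P<amt1>\d+)"
    r"|(?P<amt2>\d+)\s+(?P<cur2>[A-Za-z]{3}))\b"
)

def extract_salary(text):
    """Return (amount, currency), or None if no salary phrase is found."""
    m = SALARY_RE.search(text)
    if m is None:
        return None
    # Exactly one of the two alternatives matched; pick its groups.
    amount = m.group("amt1") or m.group("amt2")
    currency = m.group("cur1") or m.group("cur2")
    return int(amount), currency.upper()

A = "my name is adam. expected salary usd 5000 USD test test "
B = "my name is sara. expected salary 8200 MYR test tessksjdkjs "
C = "my name is sara. expected salary IDR 999944 and iam ksdjfjksdh "

for x in (A, B, C):
    amount, currency = extract_salary(x)
    print("salary {} CURRENCY {}".format(amount, currency))
```

Unlike the `split(' ')[:2]` approach, this returns `None` instead of raising an `IndexError` when the string contains no recognizable salary phrase, so real input should be checked for `None` before unpacking.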