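Python regex: find dollar amounts together with a few surrounding words

Time: 2017-07-25 19:49:53

Tags: python regex search text

I need to find, in a paragraph, both the dollar amounts and the few (3 or 4) words associated with each amount.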

in-process research and development of $184.3 million and charges $120 of 
million for the impairment of long-lived assets. See Notes 2, 16 and 21 to the 
Consolidated Financial Statements. Income from continuing operations for the 
fiscal year ended September 30, 2001 also includes a net gain on sale of 
businesses and investments of $276.6 million and a net gain on the sale of 
common shares of a subsidiary of $64.1 million.

What I want is the following: [amount, amount + the word after it, the 3-4 words before the amount].

[$184.3, $184.3 million, research and development of $184.3 million], [$120, $120 of million, charges $120 of 
million for the impairment of long-lived assets], [$276.6, $276.6 million, investments of $276.6 million], [$64.1, $64.1 million, a subsidiary of $64.1 million.]

This is what I tried; it only finds the amount.

[\$]{1}\d+\.?\d{0,2}

Thanks!

1 answer:

Answer 0 (score: 2)

So, let's give your pattern a name:

amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"
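As a quick check, here is a minimal sketch of what this pattern finds on its own. The `text` string is just the first sentence of the question's sample, and `re.findall` is one of several `re` functions that could be used here.

```python
import re

amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"

# First sentence of the sample paragraph from the question.
text = (
    "in-process research and development of $184.3 million and charges $120 of "
    "million for the impairment of long-lived assets."
)

# The pattern has no capturing groups, so findall returns the full matches.
amounts = re.findall(amount_patt, text)
print(amounts)  # ['$184.3', '$120']
```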

Then the amount plus the word that follows it can be defined using the above:

digit_word_patt = amount_patt + r" (\w+)"
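One thing to watch for: because this pattern contains a capturing group, `re.findall` returns only the captured word, not the whole match. A small sketch, assuming the same sample sentence as in the question, that uses `re.finditer` to get both:

```python
import re

amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"
digit_word_patt = amount_patt + r" (\w+)"

text = (
    "in-process research and development of $184.3 million and charges $120 of "
    "million for the impairment of long-lived assets."
)

# findall returns only the contents of the single capturing group.
only_words = re.findall(digit_word_patt, text)

# finditer gives match objects: group(0) is the whole match, group(1) the word.
pairs = [(m.group(0), m.group(1)) for m in re.finditer(digit_word_patt, text)]

print(only_words)  # ['million', 'of']
print(pairs)       # [('$184.3 million', 'million'), ('$120 of', 'of')]
```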

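Now, for the surrounding 3-4 words, do the following (note that the quantifier must be written `{3,4}` with no space, and the words after the amount need their separating space in front):

words_patt = r"(\S+ ){3,4}" + amount_patt + r"( \S+){3,4}"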
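A note on the quantifier syntax: Python's `re` only recognises a range quantifier when it is written without a space, as in `{3,4}`. With a space inside the braces the text is treated as literal characters. The short sketch below illustrates this on a made-up string (`sample` is purely illustrative).

```python
import re

sample = "a b c d e"

# "{3, 4}" (with a space) is not a quantifier; re looks for those literal
# characters after one word, so nothing matches in this sample.
with_space = re.search(r"(\S+ ){3, 4}", sample)

# "{3,4}" means three or four repetitions of "word followed by a space".
without_space = re.search(r"(\S+ ){3,4}", sample)

print(with_space)              # None
print(without_space.group(0))  # 'a b c d '
```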

You're done! Now just use these patterns with the re module's functions to extract the strings.
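To put the pieces together, here is one possible sketch that produces the three-part result the question asks for. It is not the only way to do it and it swaps in a slightly different technique: the regex finds each amount and the word after it, and the preceding words are taken by splitting the text before the match, which avoids problems when two amounts are close together. The function name `extract_amounts`, the named groups, and the `n_words` parameter are illustrative choices, not part of the answer above.

```python
import re

amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"

# Amount, optionally followed by whitespace and one word (named groups are
# an illustrative choice).
amount_word_re = re.compile(
    r"(?P<amount>" + amount_patt + r")(?:\s+(?P<next>\w+))?"
)

text = """in-process research and development of $184.3 million and charges $120 of
million for the impairment of long-lived assets. See Notes 2, 16 and 21 to the
Consolidated Financial Statements. Income from continuing operations for the
fiscal year ended September 30, 2001 also includes a net gain on sale of
businesses and investments of $276.6 million and a net gain on the sale of
common shares of a subsidiary of $64.1 million."""


def extract_amounts(text, n_words=4):
    """Return [amount, amount + next word, preceding words + amount + next word]."""
    rows = []
    for m in amount_word_re.finditer(text):
        amount = m.group("amount")
        next_word = m.group("next")
        amount_word = amount if next_word is None else amount + " " + next_word
        # Take the last n_words whitespace-separated tokens before the amount.
        before = text[:m.start()].split()[-n_words:]
        rows.append([amount, amount_word, " ".join(before + [amount_word])])
    return rows


rows = extract_amounts(text)
for row in rows:
    print(row)
```

With `n_words=4` the first row is `['$184.3', '$184.3 million', 'research and development of $184.3 million']`. Because the context is a fixed number of tokens, it can include a previous amount when two amounts sit close together (as with `$120` here); lowering `n_words` is one way to tune that.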