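I need a regular expression to parse a string containing fractions and an operation [+, -, *, or /],
and, using the findall function from the re module,
return a 5-element tuple containing the numerators, the denominators, and the operation.
Example: str = "15/9 + -9/5"
The output should have the form [("15","9","+","-9","5")]
I was able to come up with this: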
pattern = r'-?\d+|\s+\W\s+'
print(re.findall(pattern, str))
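which produces the output ["15","9"," + ","-9","5"].
But after fiddling with this for a while, I can't turn the result into a 5-element tuple, and I can't match the operation without also matching the whitespace around it.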
Answer 0 (score: 0)
This pattern works:
(-?\d+)\/(\d+)\s+([+\-*/])\s+(-?\d+)\/(\d+)
# Let's walk through it
(-?\d+) # matches a run of digits with an optional leading `-` sign, captured as group 1
\/ # matches a literal `/` (the backslash escape is harmless but not required in Python)
(\d+) # matches a run of digits, captured as group 2
\s+([+\-*/])\s+ # matches one of `+`, `-`, `*`, `/` and captures only that character as group 3 (the surrounding whitespace stays outside the group)
(-?\d+)\/(\d+) # same as groups 1 and 2, captured as groups 4 and 5
Demo:
>>> s = "15/9 + -9/5 6/12 * 2/3"
>>> re.findall(r'(-?\d+)\/(\d+)\s+([+\-*/])\s+(-?\d+)\/(\d+)', s)
[('15', '9', '+', '-9', '5'), ('6', '12', '*', '2', '3')]
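As a minimal sketch of how this could look in a script rather than at the prompt, the same expression can be compiled with re.VERBOSE so that the per-group comments sit inside the pattern itself. The names FRACTION_OP and parse_expressions are illustrative and not part of the original answer. Because the pattern has more than one capturing group, findall returns one tuple per match.

```python
import re

# Same pattern as above, laid out with re.VERBOSE so whitespace and
# comments inside the pattern are ignored by the regex engine.
FRACTION_OP = re.compile(r"""
    (-?\d+) / (\d+)      # groups 1 and 2: signed numerator, denominator
    \s+ ([+\-*/]) \s+    # group 3: the operator, whitespace left uncaptured
    (-?\d+) / (\d+)      # groups 4 and 5: second fraction
""", re.VERBOSE)


def parse_expressions(text):
    """Return one 5-tuple of strings per 'a/b op c/d' expression in text."""
    return FRACTION_OP.findall(text)


print(parse_expressions("15/9 + -9/5"))
# [('15', '9', '+', '-9', '5')]
```

Note that this sketch, like the pattern it is based on, requires whitespace on both sides of the operator, so an input such as "15/9+-9/5" would yield no matches.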
Answer 1 (score: 0)
A general approach to tokenizing a string with a regular expression is:
import re
pattern = r"\s*(\d+|[/+*-])"
def tokens(x):
    # group(1) is the token itself, without the leading whitespace
    return [m.group(1) for m in re.finditer(pattern, x)]
print(tokens("9 / 4 + 6 "))
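To see what this tokenizer does with the string from the question, here is a small self-contained sketch. The compiled name TOKEN is illustrative. One thing it makes visible: the unary minus in "-9" comes out as its own token, so grouping the tokens into the 5-element tuple the question asks for would need an extra step that is not shown here.

```python
import re

# Same pattern as in the answer, precompiled and written as a raw string.
TOKEN = re.compile(r"\s*(\d+|[/+*-])")


def tokens(text):
    """Return the numbers and operator characters found in text, in order."""
    return [m.group(1) for m in TOKEN.finditer(text)]


print(tokens("9 / 4 + 6 "))
# ['9', '/', '4', '+', '6']
print(tokens("15/9 + -9/5"))
# ['15', '/', '9', '+', '-', '9', '/', '5']
```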
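Notes:
Start the pattern with \s* to skip over any leading whitespace.
Separate the alternative token patterns with |.
Be careful when using \W, because it also matches whitespace.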
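The point about \W can be checked directly. The short sketch below compares \W with an explicit operator character class on the string from the question; the variable name text is just for illustration.

```python
import re

text = "15/9 + -9/5"

# \W matches any non-word character, and spaces are non-word characters.
print(re.findall(r"\W", text))
# ['/', ' ', '+', ' ', '-', '/']

# An explicit character class matches only the operator characters.
print(re.findall(r"[+\-*/]", text))
# ['/', '+', '-', '/']
```

This is why the pattern in the question, which uses \s+\W\s+, cannot isolate the operator: nothing in it restricts the match to the four operator characters, and without a capturing group findall returns the whole match including the whitespace.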