I'm looking for help manipulating a list of strings. I want to extract the numbers from each string, for example:
x = ['aa bb qq 2 months 60%', 'aa bb qq 3 months 70%', 'aa bb qq 1 month 80%']
I would like to get:
[[2.0,60.0],[3.0,70.0],[1.0,80.0]]
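in an elegant way.
The first number should always be an integer, but the second number can be a float with a decimal part.
My quick-and-dirty workaround is: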
x_split = [y.replace("%", "").split() for y in x]
x_float = [[float(s) for s in x if s.isdigit()] for x in x_split]
Out[100]: [[2.0, 60.0], [3.0, 70.0], [1.0, 80.0]]
Answer 0 (score: 7)
Use a regular expression to match both integers and floats.
import re
[[float(n) for n in re.findall(r'\d+\.?\d*', s)] for s in x]
Explanation of the regular expression (r'\d+\.?\d*'):
r # a raw string, so that backslashes are not treated as escape characters
\d # digit 0 to 9
+ # one or more of the previous pattern (\d)
\. # a decimal point
? # zero or one of the previous pattern (\.)
\d # digit 0 to 9
* # zero or more of the previous pattern (\d)
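To check that the pattern also handles the decimal case mentioned in the question, here is a minimal sketch. The helper name extract_numbers and the 70.5% sample value are assumptions added for illustration; they are not part of the original question or answer.

```python
import re

# Pattern from the answer: one or more digits, an optional decimal point,
# then zero or more digits.
NUMBER = re.compile(r'\d+\.?\d*')

def extract_numbers(strings):
    """Return one list of floats per input string (hypothetical helper)."""
    return [[float(n) for n in NUMBER.findall(s)] for s in strings]

# Hypothetical sample: the second entry carries a decimal percentage.
x = ['aa bb qq 2 months 60%', 'aa bb qq 3 months 70.5%', 'aa bb qq 1 month 80%']
result = extract_numbers(x)
print(result)  # [[2.0, 60.0], [3.0, 70.5], [1.0, 80.0]]
```

By contrast, the isdigit() filter in the question would drop the 70.5 entry, because '70.5'.isdigit() is False. Note that this pattern does not match a leading minus sign or exponent notation, so it is only suitable when the inputs are plain non-negative numbers like the ones shown.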