How to separate a float / double / integer from a string. Basically, I have column values in a pandas df, e.g.:
150.39999389648438cfm
0.30000001192092896inHO
63.70000076293945%
73.0999984741211F
85.9000015258789kW
75.5F
100%
44S
32F
false
NA
true
My result set should contain the values without any letters or special characters:
150.39999389648438
0.30000001192092896
63.70000076293945
73.0999984741211
85.9000015258789
75.5
100
44
32
false
true
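Currently I am using re, but I am looking for something more optimized.
import re

def is_alphanumeric(inst):
    # A decimal number followed by a lowercase unit suffix, e.g. "150.4cfm"
    return re.search(r"\d+\.\d+[a-z]+$", inst)

# i is the position of the column being cleaned; is_percentage is not shown here
if is_alphanumeric(str(df.iloc[0, i])):
    df.iloc[:, i] = df.iloc[:, i].apply(lambda x: float(''.join(ele for ele in x if ele.isdigit() or ele == '.')))
if is_percentage(str(df.iloc[0, i])):
    df.iloc[:, i] = df.iloc[:, i].apply(lambda x: float(''.join(ele for ele in x if ele.isdigit() or ele == '.')))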
Answer 0 (score: 0)
Find:
^.*?\b(\d+(?:\.\d+)?|(?:true|false)\b).*$
Replace with:
$1
Explanation:
^                 : start of string
.*?               : 0 or more of any character (lazy)
\b                : word boundary
(                 : start of group 1
  \d+(?:\.\d+)?   : an integer or a float
  |               : or
  (?:true|false)  : true or false
  \b              : word boundary, required only after true/false so that a unit attached directly to a number (e.g. 75.5F) does not cut off the decimals
)                 : end of group 1
.*                : 0 or more of any character
$                 : end of string
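A minimal sketch of applying this kind of pattern in Python, in case it helps. It assumes the trailing word boundary is needed only after `true`/`false`, so that a unit glued to the number (as in `75.5F`) does not truncate the decimals. Because `re.search` scans the string, the leading `^.*?` and trailing `.*$` are dropped. The names `VALUE_RE` and `extract_value` are made up for illustration.

```python
import re

# Assumed variant of the pattern: \b is required only after true/false,
# so "75.5F" yields "75.5" rather than "75".
VALUE_RE = re.compile(r"\b(\d+(?:\.\d+)?|(?:true|false)\b)")

def extract_value(text):
    """Return the first number or boolean found in text as a string, else None."""
    match = VALUE_RE.search(str(text))
    return match.group(1) if match else None

samples = ["150.39999389648438cfm", "63.70000076293945%", "75.5F", "100%", "false", "NA"]
cleaned = [extract_value(s) for s in samples]
print(cleaned)
# ['150.39999389648438', '63.70000076293945', '75.5', '100', 'false', None]
```

For a whole column, the same pattern string can probably be handed to pandas directly, e.g. `df[col].astype(str).str.extract(VALUE_RE.pattern, expand=False)` (with `col` standing in for your column name). This runs the regex over the column in one call instead of a per-row `apply`, and rows with no match such as `NA` come back as NaN.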