Question

How can I separate the float / double / integer part from a string? Basically, I have a column in a pandas df with values such as:

150.39999389648438cfm
0.30000001192092896inHO
63.70000076293945%
73.0999984741211F
85.9000015258789kW
75.5F
100%
44S
32F
false
NA
true

My result set should contain the values without any letters or special characters:

150.39999389648438
0.30000001192092896
63.70000076293945
73.0999984741211
85.9000015258789
75.5
100
44
32
false

true

I am currently using re, but I am looking for something more optimized.

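import re

def is_alphanumeric(inst):
    # A float followed by lowercase letters, e.g. "150.4cfm"
    return re.search(r"\d+\.\d+[a-z]+$", inst)

# i is the index of the column being processed
if is_alphanumeric(str(df.iloc[0, i])):
    # Keep only digits and dots, then convert to float
    df.iloc[:, i] = df.iloc[:, i].apply(lambda x: float(''.join(ele for ele in x if ele.isdigit() or ele == '.')))

# is_percentage is not shown; presumably a similar check for values ending in %
if is_percentage(str(df.iloc[0, i])):
    df.iloc[:, i] = df.iloc[:, i].apply(lambda x: float(''.join(ele for ele in x if ele.isdigit() or ele == '.')))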

Answer 1

Search: ^.*?\b(?:(\d+(?:\.\d+)?|true\b|false\b)).*?$
Replace with: $1

Explanation:

^                   : start of string
.*?                 : 0 or more any char
\b                  : word boundary
(?:                 : non capture group
  (                 : group 1
    \d+(?:\.\d+)?   : an integer or a float
    |               : or
    true\b          : true, followed by a word boundary
    |               : or
    false\b         : false, followed by a word boundary
  )                 : end group 1
)                   : end group
.*?                 : 0 or more any char
$                   : end of string
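
In Python, the search/replace could be applied roughly as in the sketch below. This is only one possible way to do it, not a definitive implementation. The helper names strip_unit and strip_units are hypothetical, and the sample values are taken from the question. Note that re.sub writes the replacement as \1 rather than $1. The pattern requires a word boundary after true/false but not after the number, because a digit followed by a letter (as in 75.5F) has no boundary between them.

```python
import re

import pandas as pd

# Full-line pattern for search/replace on a single string.
LINE_PATTERN = re.compile(r"^.*?\b(\d+(?:\.\d+)?|true\b|false\b).*$")

# Shorter pattern for pandas: str.extract only needs the capture group.
EXTRACT_PATTERN = r"\b(\d+(?:\.\d+)?|true\b|false\b)"


def strip_unit(text):
    """Hypothetical helper: return the number (or true/false) found in text.

    Strings without a match, such as "NA", are returned unchanged.
    """
    return LINE_PATTERN.sub(r"\1", text)


def strip_units(column):
    """Hypothetical helper: vectorized version for a whole column.

    Rows without a match become missing values instead of staying unchanged.
    """
    return column.astype(str).str.extract(EXTRACT_PATTERN, expand=False)


values = pd.Series([
    "150.39999389648438cfm",
    "0.30000001192092896inHO",
    "63.70000076293945%",
    "75.5F",
    "100%",
    "44S",
    "false",
    "NA",
    "true",
])

cleaned = strip_units(values)
print(cleaned.tolist())
```

The vectorized str.extract call avoids the per-row lambda from the question. If only the numeric values are needed, pd.to_numeric(cleaned, errors="coerce") would convert them to floats, with true/false becoming NaN as well.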
