Convert an alphanumeric string to a numeric string

Time: 2013-06-06 20:02:41

Tags: python string

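I've looked around and haven't found a clear answer on how to convert an alphanumeric string to a number in Python. Here are examples of the conversions I want.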

"1234alpha" --> 1234
"a1234asdf" --> 0
"1234.56yt" --> 1234.56

Any suggestions would be appreciated.

DK

7 answers:

Answer 0 (score: 4)

For a change, using itertools and no regex:

>>> import itertools as it
>>> number = ''.join(it.takewhile(str.isdigit, '123dfd'))
>>> int(number) if number else 0
123
>>> number = ''.join(it.takewhile(str.isdigit, 'a123dfd'))
>>> int(number) if number else 0
0

It's a bit uglier for floats:

>>> number = ''.join(it.takewhile(lambda x: x.isdigit() or 
                                   x == '.', '123.45dfd'))
>>> float(number) if number else 0
123.45

Floats, including negatives:

def make_number(alphanum):
    sign = 1
    if alphanum and alphanum[0] in '+-':
        sign = int(alphanum[0] + '1')
        alphanum = alphanum[1:]
    try:    
        return float(''.join(it.takewhile(lambda x: x.isdigit() 
                                           or x == '.', alphanum))) * sign
    except ValueError:
        return 0

Conclusion: changing the requirements along the way can turn a simple solution into a complicated one.
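One thing the takewhile approach doesn't guard against is a second decimal point: for an input like "1.2.3" the collected prefix is "1.2.3", float() rejects it, and the result falls back to 0. Here is a minimal sketch of one way to stop at the second dot. The name leading_number and the closure-based accept function are my own illustration, not part of the answer above.

```python
import itertools as it

def leading_number(text):
    """Hypothetical helper: parse an optionally signed leading number, else 0."""
    sign = 1
    if text[:1] in ('+', '-'):
        sign = -1 if text[0] == '-' else 1
        text = text[1:]

    seen_dot = False

    def accept(ch):
        # Accept digits, plus the first '.' only.
        nonlocal seen_dot
        if ch == '.' and not seen_dot:
            seen_dot = True
            return True
        return ch.isdigit()

    prefix = ''.join(it.takewhile(accept, text))
    try:
        return sign * float(prefix)
    except ValueError:
        # Empty prefix, or a lone '.', is not a number.
        return 0
```

With this version "1.2.3" yields 1.2 rather than 0. Whether that is the desired behavior depends on the requirements, which the question leaves open.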

Answer 1 (score: 1)

To support positive/negative integers and floats, you could use a slightly modified version of the regex from Extract float/double value:

import re

re_float = re.compile(r"""(?x)
   ^
      [+-]?\ *      # first, match an optional sign *and space*
      (             # then match integers or f.p. mantissas:
          \d+       # start out with a ...
          (
              \.\d* # mantissa of the form a.b or a.
          )?        # ? takes care of integers of the form a
         |\.\d+     # mantissa of the form .b
      )
      ([eE][+-]?\d+)?  # finally, optionally match an exponent
   """)

def extract_number(s, default=None):
    m = re_float.match(s)
    if not m:
        return default # no number found
    f = float(m.group(0)) #XXX to support huge numbers, try/except int() first
    return int(f) if f.is_integer() else f

Example

import sys

for s in sys.stdin:
    print(extract_number(s, default=0))

Input

1234alpha
a1234asdf
1234.56yt
-1e20.

Output

1234
0
1234.56
-100000000000000000000
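The #XXX comment in extract_number hints that going through float() loses precision for very large integers. A rough sketch of the "try int() first" idea follows. It uses a simplified, non-verbose pattern of my own (NUMBER_RE) that does not allow a space after the sign, and the function name extract_number_exact is hypothetical.

```python
import re

# Simplified pattern for illustration only; not the verbose regex above.
NUMBER_RE = re.compile(r'[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?')

def extract_number_exact(s, default=None):
    m = NUMBER_RE.match(s)
    if not m:
        return default  # no number found
    text = m.group(0)
    try:
        # int() is exact for arbitrarily large integer literals.
        return int(text)
    except ValueError:
        # Has a fraction or exponent, so fall back to float.
        f = float(text)
        return int(f) if f.is_integer() else f
```

A 30-digit prefix such as "123456789012345678901234567890abc" then comes back digit for digit, whereas a round trip through float() would round it.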

Answer 2 (score: 0)

You can use the re module:

import re

def alp(s):
    m = re.match(r'\d+', s)
    return int(m.group(0)) if m is not None and m.start() == 0 else 0

In [3]: alp('a1234asdf')
Out[3]: 0

In [4]: alp('1234alpha')
Out[4]: 1234

If you want to include negative integers:

def alp_neg(s):
    m = re.match(r'[+-]?\d+', s)
    return int(m.group(0)) if m is not None and m.start() == 0 else 0

If you also want floats:

def alp_floats(s):
    m = re.match(r'[+-]?\d+(\.\d+)?', s)
    return float(m.group(0)) if m is not None and m.start() == 0 else 0

In [7]: alp_floats('-12.2ss31.232sadas')
Out[7]: -12.2
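A side note on the m.start() == 0 checks: re.match only ever matches at the beginning of the string, so whenever m is not None its start is already 0 and the extra test is redundant. The sketch below illustrates the difference from re.search. The name leading_int is my own, chosen for the example.

```python
import re

def leading_int(s):
    """Hypothetical variant of alp(): leading integer, or 0 if there is none."""
    m = re.match(r'[+-]?\d+', s)  # match() only tries position 0
    return int(m.group(0)) if m else 0

# search() scans the whole string, so it finds digits that match() ignores.
found_later = re.search(r'\d+', 'a1234asdf')
```

Here found_later.start() is 1, while re.match with the same pattern and input returns None.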

Answer 3 (score: 0)

import re
def str_to_int(string):
    match = re.match(r"\d+", string)
    if match:
        try:
            return int(match.group())
        except ValueError:
            return float(match.group())
    else:
        return 0

str_to_int("1234alpha") 
1234
str_to_int("a1234asdf") 
0

Answer 4 (score: 0)

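import ast
from itertools import takewhile

string = '1234.56yt'  # sample input; the original snippet left `string` undefined

# x <= '9' keeps every character sorting at or before '9': digits, but also '.', '+', '-' and space
ast.literal_eval(''.join(takewhile(lambda x: x <= '9', string)) or '0')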
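The comparison x <= '9' works by code point, so it lets through more than digits and dots, and literal_eval raises on prefixes that are not valid Python literals. A more defensive sketch might look like the following. The function name literal_prefix and the decision to return 0 on failure are assumptions on my part, chosen to match the question's examples.

```python
import ast
from itertools import takewhile

def literal_prefix(string):
    """Hypothetical wrapper: evaluate the leading digits/dots as a Python literal."""
    prefix = ''.join(takewhile(lambda ch: ch.isdigit() or ch == '.', string))
    try:
        # literal_eval keeps ints as int and floats as float.
        return ast.literal_eval(prefix or '0')
    except (ValueError, SyntaxError):
        # e.g. '1.2.3' or '.' are not valid literals.
        return 0
```

Unlike the float()-based answers, this returns an int for "1234alpha" and a float for "1234.56yt". Note that Python 3 rejects integer literals with leading zeros, so an input such as "0123abc" would end up as 0 here.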

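Answer 5 (score: 0)

If the exact rules become hard to define, you might consider a binary search approach that tries to find the longest acceptable prefix.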

def binsearch_prefix(seq, predicate):
    best_upper = 0
    # upper is exclusive, so len(seq) + 1 lets the full string be tested too
    lower, upper = 0, len(seq) + 1
    while lower < upper:
        mid = (lower + upper) // 2  # integer division (Python 3)
        if predicate(seq[:mid]):
            best_upper = mid
            lower = mid + 1
        else:
            upper = mid
    return seq[:best_upper]

It returns the part of the string that you consider acceptable. For example, this could be your acceptance function:

def can_float(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

Example:

print(binsearch_prefix("1234alpha", can_float))  # "1234"
print(binsearch_prefix("a1234asdf", can_float))  # ""
print(binsearch_prefix("1234.56yt", can_float))  # "1234.56"

You can then format the prefix however you like.
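A binary search assumes that once a prefix is rejected, every longer prefix is rejected too. That does not hold for float(): "1e" is rejected but "1e5" is accepted. If that matters, a plain linear scan from the longest prefix downward is a simple alternative. The sketch below is my own, with the hypothetical name longest_accepted_prefix, and it repeats can_float so that it runs on its own.

```python
def can_float(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

def longest_accepted_prefix(seq, predicate):
    """Hypothetical linear alternative: try prefixes from longest to shortest."""
    for end in range(len(seq), 0, -1):
        if predicate(seq[:end]):
            return seq[:end]
    return seq[:0]  # empty prefix of the same type as seq
```

This costs O(n) predicate calls instead of O(log n), but it does not depend on the predicate being monotonic. Keep in mind that float() also accepts strings such as "nan" and "inf", so an input like "nanometer" would yield "nan" with this predicate.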

Answer 6 (score: -1)

Maybe use a regex?

import re

def str2num(s):
    try:
        num = re.match(r'^([0-9]+)', s).group(1)
    except AttributeError:
        num = 0
    return int(num)

print(str2num('1234alpha'))
print(str2num('a1234asdf'))

Output:

1234
0