How to split strings into text and number?

Asked: 2009-01-09 23:02:17

Tags: python string split

I would like to split strings like these

'foofo21'
'bar432'
'foobar12345'

into lists like this:

['foofo', '21']
['bar', '432']
['foobar', '12345']

Does anyone know a simple way to do this in Python?

10 answers:

Answer 0: (score: 50)

I would approach this by using re.match in the following way:

import re

match = re.match(r"([a-z]+)([0-9]+)", 'foofo21', re.I)
if match:
    items = match.groups()
    # items is ("foofo", "21")
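
If the input may contain strings that are not exactly letters followed by digits, the same pattern can be wrapped in a small helper. This is only a sketch: the name split_text_number is hypothetical, and it assumes Python 3.4+ so that fullmatch is available. Unlike re.match, fullmatch rejects trailing characters after the digits.

```python
import re

# Same pattern as in the answer above; the helper name is made up for illustration.
_TEXT_NUMBER = re.compile(r"([a-z]+)([0-9]+)", re.I)

def split_text_number(s):
    """Return (text, number) as strings, or None if s is not letters followed by digits."""
    match = _TEXT_NUMBER.fullmatch(s)
    return match.groups() if match else None

print(split_text_number("foofo21"))   # ('foofo', '21')
print(split_text_number("foofo21x"))  # None
print(split_text_number("21foofo"))   # None
```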

Answer 1: (score: 26)

>>> def mysplit(s):
...     head = s.rstrip('0123456789')
...     tail = s[len(head):]
...     return head, tail
... 
>>> [mysplit(s) for s in ['foofo21', 'bar432', 'foobar12345']]
[('foofo', '21'), ('bar', '432'), ('foobar', '12345')]
>>> 
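
Because rstrip only looks at the end of the string, it helps to see how this approach behaves on inputs that do not follow the "text then number" shape. The sketch below repeats the function (using string.digits instead of the literal digit string, which is an equivalent spelling) and tries a few edge cases.

```python
import string

def mysplit(s):
    # Strip only the trailing run of ASCII digits; the rest is the head.
    head = s.rstrip(string.digits)
    tail = s[len(head):]
    return head, tail

print(mysplit("foo2bar10"))  # ('foo2bar', '10')  digits in the middle stay in the head
print(mysplit("foobar"))     # ('foobar', '')     no trailing digits
print(mysplit("12345"))      # ('', '12345')      digits only
```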

Answer 2: (score: 19)

>>> import re
>>> r = re.compile("([a-zA-Z]+)([0-9]+)")
>>> m = r.match("foobar12345")
>>> m.group(1)
'foobar'
>>> m.group(2)
'12345'

So, if you have a list of strings with that format:

import re
r = re.compile("([a-zA-Z]+)([0-9]+)")
strings = ['foofo21', 'bar432', 'foobar12345']
print([r.match(string).groups() for string in strings])

Output:

[('foofo', '21'), ('bar', '432'), ('foobar', '12345')]

Answer 3: (score: 14)

Yet another option:

>>> import re
>>> [re.split(r'(\d+)', s) for s in ('foofo21', 'bar432', 'foobar12345')]
[['foofo', '21', ''], ['bar', '432', ''], ['foobar', '12345', '']]
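
The trailing empty strings appear because re.split also returns whatever follows the last digit run, which here is nothing. One possible way to drop them is to filter the result; the helper name split_keep_nonempty below is hypothetical.

```python
import re

def split_keep_nonempty(s):
    # re.split with a capturing group keeps the digit runs in the result;
    # the filter drops the empty strings produced at either end.
    return [part for part in re.split(r"(\d+)", s) if part]

print(split_keep_nonempty("foofo21"))   # ['foofo', '21']
print(split_keep_nonempty("10in20ft"))  # ['10', 'in', '20', 'ft']
```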

Answer 4: (score: 8)

I'm always the one to bring up findall() =)

>>> import re
>>> strings = ['foofo21', 'bar432', 'foobar12345']
>>> [re.findall(r'(\w+?)(\d+)', s)[0] for s in strings]
[('foofo', '21'), ('bar', '432'), ('foobar', '12345')]

Note that I'm using a simpler (less to type) regex than most of the previous answers.
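
One caveat with the shorter pattern: \w also matches digits and underscores, so the lazy first group can absorb a leading digit. The sketch below compares it with a letters-only class, [^\W\d_], on an input that starts with digits. The stricter pattern is an alternative offered for comparison, not part of the answer.

```python
import re

loose = re.compile(r"(\w+?)(\d+)")         # pattern from the answer
strict = re.compile(r"([^\W\d_]+)(\d+)")   # letters only in the first group

print(loose.findall("foofo21"))    # [('foofo', '21')]
print(strict.findall("foofo21"))   # [('foofo', '21')]

# With leading digits, the loose pattern pairs the first digit with the second.
print(loose.findall("12abc34"))    # [('1', '2'), ('abc', '34')]
print(strict.findall("12abc34"))   # [('abc', '34')]
```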

Answer 5: (score: 1)

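import re

s = input()
m = re.match(r"([a-zA-Z]+)([0-9]+)", s)
if m:  # re.match returns None when the input does not fit the pattern
    print(m.group(0))  # the whole match
    print(m.group(1))  # the text part
    print(m.group(2))  # the number part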

Answer 6: (score: 1)

Without using regex, using the built-in isdigit() method. It only works if the starting part is text and the latter part is a number.

def text_num_split(item):
    for index, letter in enumerate(item, 0):
        if letter.isdigit():
            return [item[:index],item[index:]]

print(text_num_split("foobar12345"))

Output:

['foobar', '12345']

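Answer 7: (score: 1)

Here is a simple function to separate multiple words and numbers from a string of any length; the re method only separates the first word and number. I think this will help everyone else in the future.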

def seperate_string_number(string):
    if not string:
        return []
    previous_character = string[0]
    groups = []
    newword = string[0]
    for i in string[1:]:
        if i.isalpha() and previous_character.isalpha():
            newword += i
        elif i.isnumeric() and previous_character.isnumeric():
            newword += i
        else:
            # The character type changed, so close the current run.
            groups.append(newword)
            newword = i

        previous_character = i

    # Append the final run after the loop (this also covers one-character input).
    groups.append(newword)
    return groups

print(seperate_string_number('10in20ft10400bg'))
# outputs : ['10', 'in', '20', 'ft', '10400', 'bg'] 
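
For comparison, a shorter way to get the same alternating runs is itertools.groupby keyed on str.isdigit. This is a different technique from the function above, shown only as a sketch; the name split_runs is hypothetical. One difference in behavior: all consecutive non-digit characters, including punctuation, end up in the same group.

```python
from itertools import groupby

def split_runs(s):
    # groupby starts a new group each time str.isdigit flips between True and False.
    return ["".join(group) for _, group in groupby(s, key=str.isdigit)]

print(split_runs("10in20ft10400bg"))  # ['10', 'in', '20', 'ft', '10400', 'bg']
print(split_runs("foobar12345"))      # ['foobar', '12345']
```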

Answer 8: (score: 0)

Here is a simple solution to this problem that doesn't need a "cat walking across the keyboard" (I mean regex :)). Enjoy ^-^

user = input('Input: ')  # e.g. user = 'foobar12345'
int_list, str_list = [], []

for item in user:
    try:
        int(item)  # raises ValueError if the character is not a digit
    except ValueError:
        str_list.append(item)
    else:
        # keep digits as str, because str.join only works with strings
        int_list.append(item)

string = ''.join(str_list)
integer = int(''.join(int_list))  # use ''.join(int_list) alone to keep it as a string

final = [string, integer]  # you can also store it in a dict: d = {string: integer}
print(final)

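Answer 9: (score: 0)

This is a little longer, but more versatile for cases where there are multiple, randomly placed numbers in the string. Also, it requires no imports.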
