How do I separate letters and digits in Python?

Asked: 2014-08-27 22:00:19

Tags: python text

I have this line:

HL1110 / 1110R / 1112 / 1112R / MFC1810 / 1810R / 1815 / 1815R / DCP1510 / 1510R / 1512 / 1512R

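As you can see, some of the entries start with HL or a few other letters. That means 1110R also belongs to the HL series.

I tried splitting the line on "/" and then looking at each string, but I need to write HL in front of every entry from 1110 on. How can I do that? isinstance(x, str) is true for both kinds of value (H or 1), so how do I tell them apart as letters and numbers?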

5 answers:

Answer 0 (score: 4)

Strings have a very handy isdigit method that you can use here.

>>> 'H'.isdigit()
False
>>> '1'.isdigit()
True
>>> '10'.isdigit()
True
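
Note that isdigit() is only True when every character is a digit, so 'HL1110'.isdigit() is False. To find where the letters end you can test one character at a time. A minimal sketch, where split_prefix is a hypothetical helper name and not something from the question:

```python
def split_prefix(token):
    """Split a token into (leading non-digit characters, the rest)."""
    for index, char in enumerate(token):
        if char.isdigit():
            # Everything before the first digit is the prefix
            return token[:index], token[index:]
    return token, ""  # no digit found at all


print(split_prefix("HL1110"))  # ('HL', '1110')
print(split_prefix("1110R"))   # ('', '1110R')
```

An empty first element tells you the item has no series prefix of its own.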

Answer 1 (score: 3)

You can separate them with isdigit(). For example, using a list comprehension:

>>> s = "HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R"
>>> [i for i in s.split('/')]
['HL1110', '1110R', '1112', '1112R', 'MFC1810', '1810R', '1815', '1815R', 'DCP1510', '1510R', '1512', '1512R']
>>> [i for i in s.split('/') if i.isdigit()]
['1112', '1815', '1512']
>>> [i for i in s.split('/') if not i.isdigit()]
['HL1110', '1110R', '1112R', 'MFC1810', '1810R', '1815R', 'DCP1510', '1510R', '1512R']

Using filter is equivalent (on Python 3, filter returns an iterator, so wrap it in list() to see the items):

>>> list(filter(lambda x: x.isdigit(), s.split('/')))
['1112', '1815', '1512']
>>> list(filter(lambda x: not x.isdigit(), s.split('/')))
['HL1110', '1110R', '1112R', 'MFC1810', '1810R', '1815R', 'DCP1510', '1510R', '1512R']

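Alternatively, if you only want certain strings, you can use a different condition in the if clause. To keep only the strings that contain 'R' or 'HL', just change the condition: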

>>> [i for i in s.split('/') if ('R' in i) or ('HL' in i)]
['HL1110', '1110R', '1112R', '1810R', '1815R', '1510R', '1512R']
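
The filters above select items but do not carry the series name forward, which is what the question asks for. One possible sketch, assuming every series prefix consists of leading letters and that a bare item belongs to the most recently seen prefix (the names series and full_names are made up for this example):

```python
from itertools import takewhile

s = "HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R"

series = ""      # most recently seen prefix
full_names = []
for item in s.split("/"):
    item = item.strip()  # tolerate "HL1110 / 1110R" style spacing
    # Collect the leading letters, if any
    prefix = "".join(takewhile(str.isalpha, item))
    if prefix:
        series = prefix
        full_names.append(item)
    else:
        # No prefix of its own: reuse the current series
        full_names.append(series + item)

print(full_names)
```

This should give HL1110, HL1110R, HL1112, HL1112R, then the MFC and DCP entries in the same way.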

Answer 2 (score: 2)

I would use regular expressions to break out the series components:

from pprint import pprint
import re

line = 'HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R'

result = {}
series = ''
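for item in line.split('/'):
    # Leading non-digits (possibly empty) name the series; the rest is the model
    match = re.match(r'(\D*)(.*)', item)
    if not match:
        print('%s: bad form?' % item)
        continue
    prefix, model = match.groups()
    if prefix:
        series = prefix  # a new series starts here
    result.setdefault(series, []).append(model)
pprint(result)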

Another approach, using re.findall() instead of re.match():

from pprint import pprint
import re

line = 'HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R'

series = None
result = {}
for maybe_series, item in re.findall(r'([A-Z]*)([^/]+)', line):
    series = maybe_series or series
    result.setdefault(series, []).append(item)
pprint(result)
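
Both versions build a dict that maps each series to its models. If you then need to answer "which series does 1110R belong to?", you can invert that mapping. A small sketch, assuming the line has no spaces around the slashes and that no model string appears under two series (series_of is a name invented for this example):

```python
import re

line = 'HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R'

# Same grouping as above: series name -> list of models
result = {}
series = None
for maybe_series, item in re.findall(r'([A-Z]*)([^/]+)', line):
    series = maybe_series or series
    result.setdefault(series, []).append(item)

# Invert it: model -> series name
series_of = {model: name for name, models in result.items() for model in models}

print(result['HL'])        # ['1110', '1110R', '1112', '1112R']
print(series_of['1110R'])  # HL
```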

Answer 3 (score: 2)

>>> from itertools import groupby
>>> s = "HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R"
>>> for item in s.split("/"):
...     print(["".join(g) for k, g in groupby(item, str.isdigit)])
... 
['HL', '1110']
['1110', 'R']
['1112']
['1112', 'R']
['MFC', '1810']
['1810', 'R']
['1815']
['1815', 'R']
['DCP', '1510']
['1510', 'R']
['1512']
['1512', 'R']
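
The groupby pieces can be turned into structured records. A rough sketch, assuming each item has the shape optional letters, digits, optional letters, and that a missing prefix means "same series as the previous item" (rows, number and suffix are names chosen for this example):

```python
from itertools import groupby

s = "HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R"

series = ""
rows = []
for item in s.split("/"):
    # Runs of digits and runs of non-digits, in order
    parts = ["".join(g) for _, g in groupby(item, str.isdigit)]
    if not parts[0].isdigit():
        series = parts.pop(0)  # leading letters start a new series
    number = int(parts[0])
    suffix = parts[1] if len(parts) > 1 else ""
    rows.append((series, number, suffix))

print(rows[0])  # ('HL', 1110, '')
print(rows[1])  # ('HL', 1110, 'R')
```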

Answer 4 (score: 0)

re can split them into a list of digit strings and a list of letter strings:

import re

s = "HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R"

ints = re.findall(r"\d+", s)    # one or more digits
st = re.findall(r"[A-Z]+", s)   # one or more uppercase letters

print(ints, st)
['1110', '1110', '1112', '1112', '1810', '1810', '1815', '1815', '1510', '1510', '1512', '1512'] ['HL', 'R', 'R', 'MFC', 'R', 'R', 'DCP', 'R', 'R']

If you want a list of ints:

print(list(map(int, ints)))
[1110, 1110, 1112, 1112, 1810, 1810, 1815, 1815, 1510, 1510, 1512, 1512]
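
The two separate lists no longer show which letters went with which number. If that pairing matters, a single pattern with groups keeps them together, because re.findall returns one tuple per match when the pattern has several groups. A sketch under the assumption that every item looks like optional uppercase letters, digits, optional uppercase letters:

```python
import re

s = "HL1110/1110R/1112/1112R/MFC1810/1810R/1815/1815R/DCP1510/1510R/1512/1512R"

# Each match is a (prefix, digits, suffix) tuple; missing parts are ''
triples = re.findall(r"([A-Z]*)(\d+)([A-Z]*)", s)

print(triples[:3])  # [('HL', '1110', ''), ('', '1110', 'R'), ('', '1112', '')]
```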