Question

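I want to extract some integers from a string (the third one in particular), preferably without using regular expressions.

I have seen a lot of approaches.

My string: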

xp = '93% (9774/10500)'

So I would like the code to return a list of the integers in the string. The desired output is: [93, 9774, 10500]

Things like this do not work:

>>> new = [int(s) for s in xp.split() if s.isdigit()]
>>> print new
[]
>>> int(filter(str.isdigit, xp))
93977410500

Answer 1

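Since the problem is that you have to split on several different characters, you can first replace everything that is not a digit with a space and then split. As a one-liner: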

 xp = '93% (9774/10500)'
 ''.join([ x if x.isdigit() else ' ' for x in xp ]).split() # ['93', '9774', '10500']
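
If you need this in more than one place, the one-liner can be wrapped in a small helper. This is only a sketch: the name extract_ints is made up for illustration, and it tests membership in '0123456789' rather than calling isdigit(), because on Py3 isdigit() is also true for characters such as '²' that int() rejects. Signs are ignored, so '-5' would come back as 5.

```python
def extract_ints(text):
    """Return every run of ASCII digits in text as an int, in order."""
    # Swap each non-digit for a space so split() can separate the numbers.
    spaced = ''.join(ch if ch in '0123456789' else ' ' for ch in text)
    return [int(token) for token in spaced.split()]


xp = '93% (9774/10500)'
numbers = extract_ints(xp)
third = numbers[2]  # the third integer the question asks about
```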

Answer 2

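Using a regular expression (sorry!): split the string on runs of non-digits, filter the pieces to keep only the digit strings (the split can produce empty fields), and convert them to int.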

import re

xp = '93% (9774/10500)'

# Raw string so that \D reaches the regex engine unchanged
print([int(x) for x in filter(str.isdigit, re.split(r"\D+", xp))])

Result:

[93, 9774, 10500]
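
If a regular expression is acceptable after all, matching the digits directly avoids the empty fields that splitting can leave behind. This is a different technique from the split-and-filter above (re.findall instead of re.split), shown only as a sketch; the function name extract_ints_re is hypothetical.

```python
import re


def extract_ints_re(text):
    """Return every run of ASCII digits in text as an int, in order."""
    # [0-9]+ matches each maximal run of ASCII digits, so no filtering is needed.
    return [int(match) for match in re.findall(r'[0-9]+', text)]


xp = '93% (9774/10500)'
numbers = extract_ints_re(xp)
```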

Answer 3

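Since this is Py2 and you are working with a plain str, it looks like you don't need to consider the full Unicode range; and since you are doing this more than once, you can improve slightly on polku's answer by using str.translate:

# Create a translation table once, up front, that replaces non-digits with spaces
import string
nondigits = ''.join(c for c in map(chr, range(256)) if not c.isdigit())
nondigit_to_space_table = string.maketrans(nondigits, ' ' * len(nondigits))

# Then, when you need to extract integers, use the table to translate efficiently
# at the C layer in a single function call:
xp = '93% (9774/10500)'
intstrs = xp.translate(nondigit_to_space_table).split() # ['93', '9774', '10500']
myints = map(int, intstrs) # Wrap in `list` constructor on Py3

Performance-wise, for the test string on my 64-bit Linux 2.7 build, the translate version takes about 374 nanoseconds to run, while the listcomp and join solution takes 2.76 microseconds, so listcomp + join takes more than 7x as long. For larger strings (where the fixed overhead is trivial compared to the actual work), the listcomp + join solution takes closer to 20x as long.

The main advantage of polku's solution is that it needs no changes on Py3 (where it should seamlessly support non-ASCII strings). On Py3, str.translate builds its translation table in a different way (str.maketrans), and it would be impractical to make a table that handles every non-digit in the whole Unicode space.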
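
For what it's worth, here is a rough sketch of how the translation-table idea might look on Py3 if you are willing to assume the input only contains characters from the first 256 code points. That restriction is my assumption, not something from the question; any character above that range passes through untouched and would make int() fail. The names NONDIGIT_TO_SPACE and extract_ints_translate are made up for illustration.

```python
import string

# Built once, up front: map every non-digit among the first 256 code points
# to a space. Assumption: the input never contains characters beyond that range.
NONDIGIT_TO_SPACE = str.maketrans(
    {code: ' ' for code in range(256) if chr(code) not in string.digits}
)


def extract_ints_translate(text):
    """Translate non-digits to spaces in one call, then split and convert."""
    return [int(token) for token in text.translate(NONDIGIT_TO_SPACE).split()]


xp = '93% (9774/10500)'
numbers = extract_ints_translate(xp)
```

The table uses string.digits rather than isdigit() because on Py3 isdigit() is also true for characters such as '²', which int() cannot parse.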

Answer 4

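Since the format is fixed, you can use successive calls to split(). It is not very pretty, nor is it general, but sometimes the direct and "dumb" solution is not so bad: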

a, b = xp.split("%")
x = int(a)
y = int(b.split("/")[0].strip()[1:])
z = int(b.split("/")[1].strip()[:-1])
print(x, y, z) # prints "93 9774 10500"

Edit: to clarify, the poster explicitly stated that his format is fixed. This solution is not very pretty, but it does what it is meant to do.
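
The same fixed-format idea can also be written with str.partition, which avoids splitting on "/" twice. This is only a sketch that assumes the string always looks like 'P% (A/B)'; the function name parse_xp and the variable names are hypothetical.

```python
def parse_xp(xp):
    """Parse a string shaped like '93% (9774/10500)' into three ints."""
    # Everything before the percent sign is the first number.
    percent, _, rest = xp.partition("%")
    # Drop the surrounding whitespace and parentheses, then split on the slash.
    current, _, total = rest.strip().strip("()").partition("/")
    return int(percent), int(current), int(total)


x, y, z = parse_xp('93% (9774/10500)')
```

If the input does not match the assumed shape, int() raises ValueError, which is arguably what you want for a format that is supposed to be fixed.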

Best way to get integers from a string without using regular expressions
