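Remove digits from a string in Python

Time: 2018-08-23 18:03:46

Tags: python string python-3.x for-loop

I want to remove all numbers from a string except those given in the dictionary. I wrote code to remove them, but I am not getting the expected result. Please take a look below: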

mystr=" hey I want to delete all the digits ex 600 502 700 m 8745 given in this string. And getting request from the ip address 122521587502. This string tells about deleting digits 502 600 765 from this."

myDict={'600','700'} # set()

# snippet to remove digits from string other than digits given in the myDict 

My solution

for w in myDict:
    for x in mystr.split():
        if (x.isdigit()):
            if(x != w):
                mystr.replace(x," ")

Expected result:

mystr=" hey I want to delete all the digits ex 600 700 m  given in this string. And getting request from the  
ip address . This string tells about deleting digits  600  from this."
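
A note on why the loop above appears to do nothing: Python strings are immutable, so str.replace returns a new string and its result has to be assigned back. The nested loop also compares each token against one kept value at a time, so 600 still counts as removable while w is 700. Below is a minimal sketch of a corrected loop, assuming whitespace-separated tokens; the helper name remove_unlisted_digits is hypothetical and not part of the question.

```python
def remove_unlisted_digits(text, keep):
    """Drop whitespace-separated all-digit tokens that are not in `keep`."""
    kept_tokens = []
    for token in text.split():
        # A single membership test replaces the outer loop over the set.
        if token.isdigit() and token not in keep:
            continue
        kept_tokens.append(token)
    # Joining builds a new string; the input is never modified in place.
    return " ".join(kept_tokens)


sample = "ex 600 502 700 m 8745 given"
print(remove_unlisted_digits(sample, {"600", "700"}))  # ex 600 700 m given
```

Tokens with trailing punctuation, such as 122521587502., are not all-digit tokens and are left untouched by this sketch.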

7 answers:

Answer 0: (score: 3)

Here is one approach.

Example:

import string

mystr= "hey I want to delete all the digits ex 600 502 700 m 8745 given in this string. And getting request from the  ip address 122521587502. This string tells about deleting digits 502 600 765 from this."
mySet={'600','700'}

rep = lambda x: x if x in mySet else None
print( " ".join(filter(None, [rep(i) if i.strip(string.punctuation).isdigit() else i for i in mystr.split()])) )

Output:

hey I want to delete all the digits ex 600 700 m given in this string. And getting request from the ip address This string tells about deleting digits 600 from this.
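
The one-liner above can be unrolled into a loop to make the rule easier to follow. This is only a sketch with a hypothetical helper name (filter_tokens). It differs from the one-liner in one assumed detail: the punctuation-stripped token is also what gets looked up in the keep set, so a kept number followed by a full stop (600.) survives.

```python
import string


def filter_tokens(text, keep):
    """Drop tokens whose punctuation-stripped form is an unwanted number."""
    result = []
    for token in text.split():
        core = token.strip(string.punctuation)
        if core.isdigit() and core not in keep:
            # Unwanted number: the whole token goes, punctuation included.
            continue
        result.append(token)
    return " ".join(result)


print(filter_tokens("digits 502 600. 765, done 8745.", {"600"}))  # digits 600. done
```

As in the original one-liner, punctuation attached to a removed number disappears with it.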

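Answer 1: (score: 2)

Here is another option. It adds spaces around the dots, but it also removes the number after ip address. The other solutions cannot do that, because the number is followed by a dot.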

import re

mystr= "hey I want to delete all the digits ex 600 502 700 m 8745 given in this string. And getting request from the  ip address 122521587502. This string tells about deleting digits 502 600 765 from this."

myDict={'600','700'}

print(" ".join("" if (i.isdigit() and i not in myDict) \
    else i for i in re.findall(r'(?:\w+|\d+|\S)', mystr)))

Output:

hey I want to delete all the digits ex 600  700 m  given in this string . And 
getting request from the ip address  . This string tells about deleting digits  
600  from this .

PS: here is an ugly way to fix the spacing around the dots:

print("".join("" if (i.isdigit() and i not in myDict) \
    else i if i == '.' or i == ',' \
    else ''.join([' ', i]) for i in re.findall(r'(?:\w+|\d+|\S)', mystr))
    .strip())

Which produces the output:

hey I want to delete all the digits ex 600 700 m given in this string. And 
getting request from the ip address. This string tells about deleting digits 
600 from this.
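
To see why this approach can handle a number followed by a dot, it helps to look at the tokens re.findall produces. The short sketch below uses a made-up sample string and the same pattern. Because \w already matches digits, the \d+ alternative is never reached, and \S picks up each punctuation character as its own token.

```python
import re

TOKEN_PATTERN = r'(?:\w+|\d+|\S)'  # same pattern as in the answer

tokens = re.findall(TOKEN_PATTERN, "ip address 122521587502. ok")
print(tokens)  # ['ip', 'address', '122521587502', '.', 'ok']
```

Since the dot becomes a separate token, isdigit() is true for the number itself and it can be dropped. The cost is that joining with spaces puts a space before every punctuation mark.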

Answer 2: (score: 1)

In [1]: mystr=" hey I want to delete all the digits ex 600 502 700 m 8745 given in this string. And getting request from the ip address 122521587502. This string tells about deleting digits 502 600 765 from this."
   ...: myDict={'600','700'}

First, you can prepare the data to remove:

   ...: mystr_l = mystr.replace('.', "").split()
   ...: to_remove = sorted(list({x for x in set(mystr_l) if x.isdigit() and x not in myDict}))
   ...: 
   ...: print(to_remove)
['122521587502', '502', '765', '8745']

and then remove it from the string:

In [4]: for x in to_remove:
   ...:     mystr = mystr.replace(x, " ")
   ...:  

My result is:

In [6]: print(mystr)
hey I want to delete all the digits ex 600  700 m  given in this string. 
And getting request from the ip address . This string tells about deleting digits  600  from this.
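
One caveat with plain str.replace: it works on substrings, so a number to remove that also occurs inside a longer number gets cut out of it as well. The sketch below uses a made-up sample string to show the effect, along with one possible workaround that uses word boundaries in a regular expression. It is an assumed variation, not part of the answer.

```python
import re

text = "codes 502 1502 600"
to_remove = ["502"]

# Plain substring replacement also eats the tail of 1502.
naive = text
for number in to_remove:
    naive = naive.replace(number, "")
print(naive)    # codes  1 600

# Word boundaries restrict the match to whole numbers.
pattern = r"\b(?:" + "|".join(map(re.escape, to_remove)) + r")\b"
bounded = re.sub(pattern, "", text)
print(bounded)  # codes  1502 600
```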

Some performance tests:

def replace_digits(src_string, exclude_list):
    result = src_string
    string_l = src_string.replace('.', "").split()
    to_remove = sorted(list({x for x in set(string_l) if x.isdigit() and x not in exclude_list}))
    for x in to_remove:
        result = result.replace(x, "")
    return result

import re

def reg(src_string, exclude_list):
    return " ".join("" if (i.isdigit() and i not in exclude_list) \
                    else i for i in re.findall(r'(?:\w+|\d+|\S)', src_string))

Test:

In [8]: %timeit replace_digits(mystr, mySet)
11.3 µs ± 31.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [9]: %timeit reg(mystr, mySet)
    ...: 
25.1 µs ± 21.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
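
%timeit is only available in IPython. In a plain script, a similar comparison can be sketched with the standard timeit module. The sample string and repetition count below are arbitrary choices, and absolute numbers will differ by machine. The two functions also do not return identical strings, so the timings compare slightly different work.

```python
import re
import timeit


def replace_digits(src_string, exclude):
    words = src_string.replace('.', '').split()
    to_remove = sorted({w for w in words if w.isdigit() and w not in exclude})
    result = src_string
    for w in to_remove:
        result = result.replace(w, '')
    return result


def reg(src_string, exclude):
    return " ".join("" if (t.isdigit() and t not in exclude) else t
                    for t in re.findall(r'(?:\w+|\S)', src_string))


sample = "ex 600 502 700 m 8745 given. address 122521587502."
keep = {'600', '700'}
runs = 10_000

for func in (replace_digits, reg):
    # timeit returns the total time in seconds for `runs` calls.
    seconds = timeit.timeit(lambda: func(sample, keep), number=runs)
    print(f"{func.__name__}: {seconds / runs * 1e6:.2f} us per call")
```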

Answer 3: (score: 0)

You can use re.sub for this. Match any number and use a callable replacer to filter out only the unwanted numbers.

Storing the digit sequences you want to keep in a set then allows O(1) lookups as you traverse the string.

Code

import re

def remove_numbers(s, keep=None):
    if keep:
        keep = set(str(x) for x in keep)
        return re.sub(r'\b\d+\b', lambda m: m.group() if m.group() in keep else '', s)
    else:
        # Shortcircuit the use of a set if there is no sequence to keep
        return re.sub(r'\b\d+\b', '', s)

Example

allowed = {600, 700}
s = 'I want to delete this: 100 200. But keep this: 600 700'
print(remove_numbers(s, allowed))

Output

I want to delete this:  . But keep this: 600 700
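
Removing numbers with re.sub leaves the surrounding spaces in place, so the result can contain doubled spaces or a space before a full stop. If that matters, a small clean-up pass can follow. The helper below (tidy_spaces, a hypothetical name) is one possible sketch and assumes only full stops and commas need to be pulled back against the preceding word.

```python
import re


def tidy_spaces(text):
    # Collapse runs of spaces or tabs left behind by removed numbers.
    text = re.sub(r'[ \t]{2,}', ' ', text)
    # Drop spaces that now sit directly before a full stop or comma.
    text = re.sub(r' +([.,])', r'\1', text)
    return text.strip()


print(tidy_spaces('address  . This  600  end'))  # address. This 600 end
```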

Answer 4: (score: 0)

You can use simple code like this, built from boolean logic and basic string operations.

mystr= "hey I want to delete all the digits ex 600 502 700 m 8745 given in this string. And getting request from the  ip address 122521587502. This string tells about deleting digits 502 600 765 from this."
myDict={'600','700'}

print( " ".join(c if not(bool(c.isdigit()) ^ bool(c in myDict)) else "" for c in mystr.split()) )

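The problem, however, is that this does not handle numbers at a boundary, that is, numbers followed by a full stop or another special character (e.g. 122521587502. in the example above). If you need to handle those as well, you can replace isdigit() with a custom function based on regex pattern matching and write somewhat more involved code to get the desired result. Here is an example that handles numbers ending with a full stop or a comma.

^[0-9]*[\,\.]?$ can be used as the regex pattern to match the cases above. (An online regex tester makes it easy to debug such patterns.) The snippet then looks like this: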

import re

isNum = lambda c: bool(re.match(r"^[0-9]*[\,\.]?$", c))
# re.split returns a list, so take its first element (the part before any comma or full stop)
func = lambda c: re.split(r"[\,\.]", c)[0] in myDict

print(" ".join(c if not(isNum(c) ^ func(c)) else "" for c in mystr.split()))
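
The not (a ^ b) test is compact but easy to misread, so here is the rule as a small truth table. The function name keep_token is made up for illustration.

```python
def keep_token(is_number, in_keep_set):
    # Mirrors `not (is_number ^ in_keep_set)` from the snippet above.
    return not (is_number ^ in_keep_set)


for is_number in (False, True):
    for in_keep_set in (False, True):
        print(is_number, in_keep_set, keep_token(is_number, in_keep_set))
```

A token is kept when it is an ordinary word (False, False) or a listed number (True, True), and dropped when it is an unlisted number (True, False). The remaining case, a non-numeric token that happens to be in the keep set, is also dropped by the XOR form. If that is not wanted, `not is_number or in_keep_set` is a possible alternative.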

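Answer 5: (score: 0)

This shows a combined algorithm for preprocessing the string and removing the digits. The program also handles the 122521587502. edge case and floating-point values such as 12.5, if they happen to be in the input string.

Code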

exclude_set = {'600', '700'}

mystr=' hey I want to delete all the digits ex 600 502 700 8745 given in this string. And getting request from the ip address 122521587502. 12.5 This string tells about deleting digits 502 600 765 from this.'

# Pre-process the string converting all whitespaces to single spaces
mystr = " ".join(mystr.split())

# Check if the word is both a digit and to be excluded
# Also catch any floats and full-stops
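# Build a new list rather than removing items from the list being iterated,
# which would skip the word that follows each removed number
mystr_list = []
for word in mystr.split():
  if word.replace('.', '').isdigit() and word not in exclude_set:
    # Keep only the trailing full stop of a removed number
    if word.endswith('.'):
      mystr_list.append('.')
  else:
    mystr_list.append(word)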

# Combine the list to form your string
mystr = ' '.join(mystr_list)
print (mystr)

Answer 6: (score: 0)

You don't need to overcomplicate things. Just make sure mySet is a dictionary by doing dict(zip(mySet, mySet)), then use it for the replacement:

import re
mySet1 = dict(zip(mySet, mySet))
# The '' default makes the replacement explicit for numbers not in the dictionary
re.sub("\\d+", lambda x: mySet1.get(x.group(), ""), mystr)
Out[604]: 'hey I want to delete all the digits ex 600 700 m given in this string. And getting request from the ip address . This string tells about deleting digits 600 from this.'