Remove spaces between digits in a string

Time: 2018-04-25 10:56:14

Tags: python regex

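I have a dataframe in which the values of one column (the 'address' column) contain stray spaces. For example: ['2 47, Philiproad, London, uk', '12 4, Northhall, London, uk']

My data contains thousands of records. How can I remove the space between, for example, '2' and '47', using a regex, to get the following result:

['247, Philiproad, London, uk', '124, Northhall, London, uk']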

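4 Answers:

Answer 0: (score: 2)

You can first remove the spaces and then add a space back after each comma. Here is what I tried; it removes the space that follows a digit: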

>>> import re
>>> string1 = '2 47, Philip road, London, uk'
>>> regex = re.compile(r"(\d )")  # raw string: a digit followed by one space
>>> regex.sub(lambda x: x.group()[0], string1)  # keep the digit, drop the space
'247, Philip road, London, uk'
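One caveat with this pattern, sketched below: it removes the space after any digit, not only a space between two digits. The helper name `drop_space_after_digit` and the second sample string are illustrative assumptions, not part of the original answer.

```python
import re

# Same idea as the answer above: a digit followed by a single space.
after_digit = re.compile(r'(\d) ')

def drop_space_after_digit(text):
    """Hypothetical helper: keep the digit, drop the space that follows it."""
    return after_digit.sub(r'\1', text)

# A comma after the house number: works as intended.
print(drop_space_after_digit('2 47, Philip road, London, uk'))
# prints: 247, Philip road, London, uk

# No comma after the house number: the number is glued to the street name.
print(drop_space_after_digit('2 47 Philip road, London, uk'))
# prints: 247Philip road, London, uk
```

If addresses without a comma after the number can occur in the data, a pattern that also checks the character after the space is likely safer.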

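Answer 1: (score: 2)

Using regex:

>>> import re
>>> [re.sub(r'(?<=\d) (?=\d)', '', ele) for ele in l]

This uses the regex concepts of lookahead and lookbehind: the space is removed only when there is a digit directly before it and directly after it.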

#driver functions:

IN : ['2 47, Philiproad, London, uk', '12 4, Northhall, London, uk']
OUT : ['247, Philiproad, London, uk', '124, Northhall, London, uk']
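To make the lookaround idea concrete, here is a minimal self-contained sketch. The names `DIGIT_GAP` and `remove_digit_spaces`, and the 'New York' sample, are assumptions added for illustration.

```python
import re

# Matches one space that has a digit directly before it and directly after it.
# Lookarounds are zero-width, so only the space itself is consumed and replaced.
DIGIT_GAP = re.compile(r'(?<=\d) (?=\d)')

def remove_digit_spaces(text):
    """Hypothetical helper: drop spaces that sit between two digits."""
    return DIGIT_GAP.sub('', text)

addresses = ['2 47, Philiproad, London, uk', '12 4, New York, us']
cleaned = [remove_digit_spaces(a) for a in addresses]
print(cleaned)
# prints: ['247, Philiproad, London, uk', '124, New York, us']
```

Because the digits are only inspected and never consumed, consecutive gaps such as '1 2 3' collapse to '123' in a single pass, while spaces between letters (as in 'New York') are left alone.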

Answer 2: (score: 1)

Edited so that New York doesn't turn into NewYork

This should sort out the address column (here I assume your dataframe is named df):

def replace_if_num(s):
    no_spaces = s.replace(' ', '')
    if no_spaces.isdigit():
        return no_spaces
    return s

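def foo(s):
    # Strip each field so that joining with ', ' does not double the spaces.
    parts = (part.strip() for part in s.split(','))
    # Return the rebuilt string; without the return, map() would fill the column with None.
    return ', '.join(map(replace_if_num, parts))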

df['address'] = df['address'].map(foo)

Answer 3: (score: 1)

Good answers have already been given; here is an alternative that uses neither lambda nor re:

# input list
lst = ['2 47, Philiproad, London, uk', '12 4, Northhall, London, uk']

# remove a space if it exists before the first comma in the element of the lst
result = [a if ' ' not in a.split(',')[0] else a.replace(' ','',1) for a in lst]

print(result)

Output:

['247, Philiproad, London, uk', '124, Northhall, London, uk']
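Note that `a.replace(' ', '', 1)` removes only the first space, so a number split into three pieces (for example '1 2 4') would keep one gap. A possible variant, sketched here on the assumption that everything before the first comma is the house number, uses `str.partition`; the function name `strip_spaces_before_first_comma` is hypothetical.

```python
def strip_spaces_before_first_comma(address):
    """Hypothetical helper: remove every space in the part before the first comma."""
    # partition() splits at the first comma only; sep is '' if there is no comma.
    head, sep, tail = address.partition(',')
    return head.replace(' ', '') + sep + tail

lst = ['2 47, Philiproad, London, uk', '1 2 4, Northhall, London, uk']
result = [strip_spaces_before_first_comma(a) for a in lst]
print(result)
# prints: ['247, Philiproad, London, uk', '124, Northhall, London, uk']
```

If a string contains no comma at all, this sketch strips the spaces from the whole string, so it only fits data where the comma after the number is always present.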