Replace numbers in a word with a character

Time: 2019-05-13 04:09:20

Tags: python regex

I have a string like this:

s = "Question1: a12 is the number of a, 1b is the number of b"

Using

x = re.compile('\w+').findall(s)

I can get

['Question1', 'a12', 'is', 'the', 'number', 'of', 'a', '1b', 'is', 'the', 'number', 'of', 'b']

Now I want to replace the numbers inside the words, for example:

  • Question1 -> Question$
  • a12 -> a$
  • 1b -> $b

I tried

y = [re.sub(r'\w*\d\w*', '$', x) for w in x]

but it replaces the whole word with $.

I would like to know whether there is a way to do the replacement correctly and, if possible, to combine finding and replacing in one function.

4 answers:

Answer 0 (score: 2)

You can adapt the following examples to your requirements.

If the digits to replace are only at the end of a word:

import re

s = "Question1: a12 is the number of a, 1b is the number of b, 123"
x = re.compile(r'\w+').findall(s)
y = [re.sub(r'(?<=[a-zA-Z])\d+$', '$', w) for w in x]
print(y)

Output:

['Question$', 'a$', 'is', 'the', 'number', 'of', 'a', '1b', 'is', 'the', 'number', 'of', 'b', '123']

In one step (working on the string directly):

import re
s ="Question1: a12 is the number of a, 1b is the number of b, abc1uvf"
pat = re.compile(r'(?<=[a-zA-Z])\d+(?=\W)')
print(re.sub(pat, "$", s))

Output:

Question$: a$ is the number of a, 1b is the number of b, abc1uvf
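The one-step pattern above requires a non-word character after the digits, so a number at the very end of the string would be left alone. One possible variation, sketched here under the assumption that a "trailing" number is a digit run that follows a letter and ends at a word boundary, uses `\b` instead of `(?=\W)`. The names `TRAILING_DIGITS` and `mask_trailing_digits` are illustrative and not part of the original answer.

```python
import re

# Assumption: a trailing number is a run of digits that follows an ASCII
# letter and ends at a word boundary (punctuation, whitespace, or end of string).
TRAILING_DIGITS = re.compile(r'(?<=[a-zA-Z])\d+\b')

def mask_trailing_digits(text, repl='$'):
    """Replace digit runs at the end of a word with repl."""
    return TRAILING_DIGITS.sub(repl, text)

print(mask_trailing_digits("Question1: a12 is the number of a, 1b, abc1uvf, x9"))
# Question$: a$ is the number of a, 1b, abc1uvf, x$
```

Note that `\b` treats the underscore as a word character, so digits followed by `_` would not be replaced by this sketch.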

If the digits can appear anywhere in the word:

import re

s = "Question1: a12 is the number of a, 1b is the number of b, 123"
x = re.compile(r'\w+').findall(s)
y = [re.sub(r'\d+', '$', w) for w in x]
print(y)

Output:

['Question$', 'a$', 'is', 'the', 'number', 'of', 'a', '$b', 'is', 'the', 'number', 'of', 'b', '$']

Note that 123 is also replaced with $. If that is not what you want, use:

import re

s = "Question1: a12 is the number of a, 1b is the number of b, 123"
x = re.compile(r'\w+').findall(s)
y = [re.sub(r'(?<=[a-zA-Z])\d+|\d+(?=[a-zA-Z])', '$', w) for w in x]
print(y)

Output:

['Question$', 'a$', 'is', 'the', 'number', 'of', 'a', '$b', 'is', 'the', 'number', 'of', 'b', '123']

In one step:

import re

s = "Question1: a12 is the number of a, 1b is the number of b, 123"
y = re.sub(r'(?<=[a-zA-Z])\d+|\d+(?=[a-zA-Z])', '$', s)
print(y)
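Since the question also asks about combining finding and replacing in one function, here is a minimal sketch that wraps the lookaround pattern above in a helper returning both the masked string and the masked word list. The names `DIGITS_IN_WORD` and `mask_and_split` are hypothetical, and the pattern only considers ASCII letters, as in the answer.

```python
import re

# Digit runs that touch an ASCII letter on either side.
DIGITS_IN_WORD = re.compile(r'(?<=[a-zA-Z])\d+|\d+(?=[a-zA-Z])')

def mask_and_split(text, repl='$'):
    """Return (masked string, list of masked words) for text."""
    masked = DIGITS_IN_WORD.sub(repl, text)
    # Split into words first, then mask each word, so the replacement
    # character does not need to be a \w character.
    words = [DIGITS_IN_WORD.sub(repl, w) for w in re.findall(r'\w+', text)]
    return masked, words

s = "Question1: a12 is the number of a, 1b is the number of b, 123"
masked, words = mask_and_split(s)
print(masked)
# Question$: a$ is the number of a, $b is the number of b, 123
print(words)
```

The standalone number 123 is left untouched because it has no letter next to it.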

Answer 1 (score: 1)

Try this:

import re
s ="Question1: a12 is the number of a, 1b is the number of b"
pat = re.compile("[0-9]+")
print(re.sub(pat, "$", s))

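Answer 2 (score: 1)

Explanation:

  • The first argument of re.sub is the pattern to replace, here the digits.

    \d matches a single digit and + means one or more occurrences, so \d+ matches a run of one or more digits.

  • The second argument is the replacement text, in this case '$'.

  • The third argument is the input string.

This does what you need: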

import re
s ="Question1: a12 is the number of a, 1b is the number of b"
print(re.sub(r'\d+', '$', s))

Output:

Question$: a$ is the number of a, $b is the number of b
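As a small illustration of the argument order described above, the sketch below uses two related standard-library features: `re.subn`, which takes the same arguments as `re.sub` and also returns the number of replacements, and the optional `count` argument of `re.sub`. It is only meant to show how the arguments fit together, not as part of the original answer.

```python
import re

s = "Question1: a12 is the number of a, 1b is the number of b"

# Arguments: pattern, replacement, input string.
# re.subn returns a (new_string, number_of_replacements) tuple.
result, n = re.subn(r'\d+', '$', s)
print(result)  # Question$: a$ is the number of a, $b is the number of b
print(n)       # 3

# The optional count argument limits how many matches are replaced.
first_only = re.sub(r'\d+', '$', s, count=1)
print(first_only)  # Question$: a12 is the number of a, 1b is the number of b
```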

Answer 3 (score: 1)

Try this:

import re
x = ['Question1', 'a12', 'is', 'the', 'number', 'of', 'a', '1b', 'is', 'the', 'number', 'of', 'b']
y = [re.sub(r'\d+', '$', w) for w in x]
print(y)

Output:

['Question$', 'a$', 'is', 'the', 'number', 'of', 'a', '$b', 'is', 'the', 'number', 'of', 'b']