Question

How do I remove the +4 from zip codes in Python?

I have data like this:

85001
52804-3233
Winston-Salem

I would like it to become:

85001
52804
Winston-Salem

Answer 1

>>> zipcode = '52804-3233'  # avoid the name "zip", which shadows the built-in
>>> zipcode[:5]
'52804'

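...of course, when you parse your lines from the raw data, you should add some kind of rule to distinguish the zip codes that need fixing from other strings, but I don't know what your data looks like, so I can't help much (you could check whether they consist only of digits and the '-' symbol, maybe?).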
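A minimal sketch of the kind of rule suggested above, assuming a ZIP+4 value is exactly five digits, a hyphen, and four digits. The helper name strip_plus4 is made up for illustration and is not part of the original answer.

```python
def strip_plus4(value):
    """Return the 5-digit part of a ZIP+4 string; leave anything else unchanged."""
    head, sep, tail = value.partition('-')
    # Assumption: a ZIP+4 is exactly 5 ASCII digits, a hyphen, and 4 ASCII digits.
    is_zip4 = (
        sep == '-'
        and len(head) == 5 and len(tail) == 4
        and head.isascii() and head.isdigit()
        and tail.isascii() and tail.isdigit()
    )
    return head if is_zip4 else value


rows = ['85001', '52804-3233', 'Winston-Salem']
print([strip_plus4(row) for row in rows])  # ['85001', '52804', 'Winston-Salem']
```

Unlike plain slicing, this leaves strings such as 'Winston-Salem' untouched instead of cutting them to five characters.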

Answer 2

>>> import re
>>> s = "52804-3233"
>>> # regex to remove a dash and 4 digits after the dash after 5 digits:
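>>> re.sub(r'(\d{5})-\d{4}', r'\1', s)
'52804'

\1 is a so-called backreference; it is replaced by the first group, which in this case is the 5-digit zip code. It is written here as the raw string r'\1', which is the same string as '\\1'.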
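To see the backreference at work on the sample data from the question, here is a small sketch. The list name samples is made up for illustration; r'\1' and '\\1' denote the same two-character replacement string.

```python
import re

samples = ['85001', '52804-3233', 'Winston-Salem']

# r'\1' is replaced by whatever group 1 (the five digits) matched.
# Strings that do not contain the pattern are returned unchanged.
cleaned = [re.sub(r'(\d{5})-\d{4}', r'\1', s) for s in samples]
print(cleaned)  # ['85001', '52804', 'Winston-Salem']
```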

Answer 3

You could try something like this:

cleaned = []
for value in inputs:
    # Keep only the first 5 characters when they are all numeric
    if value[:5].isnumeric():
        value = value[:5]
    cleaned.append(value)

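This simply keeps the first 5 characters of any string whose first 5 positions are all digits. The results are collected in a new list, because reassigning the loop variable alone would not change the original data.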

Answer 4

re.sub(r'-\d{4}$', '', zipcode)

Answer 5

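This grabs all items of the format 00000-0000 that have a space or other word boundary before and after the number, and replaces them with the first five digits. The other regexes posted will match some other number formats that you may not want.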

re.sub(r'\b(\d{5})-\d{4}\b', r'\1', zipcode)
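A short sketch of what the word boundaries change, using a made-up value that is not a ZIP+4. Note that the pattern has to be a raw string for \b to mean a word boundary; in an ordinary string literal, '\b' is a backspace character and the pattern would not match. The names LOOSE and STRICT are chosen here for illustration only.

```python
import re

LOOSE = r'(\d{5})-\d{4}'        # no boundaries
STRICT = r'\b(\d{5})-\d{4}\b'   # word boundaries on both sides

value = '123456-78901'  # made-up value: 6 digits, hyphen, 5 digits

print(re.sub(LOOSE, r'\1', value))   # 1234561 (matched inside the longer number)
print(re.sub(STRICT, r'\1', value))  # 123456-78901 (left alone)
print(re.sub(STRICT, r'\1', 'ZIP: 52804-3233.'))  # ZIP: 52804.
```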

Answer 6

Or without a regex:

output = [line[:5] if line[:5].isnumeric() and line[6:].isnumeric() else line for line in text if line]
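A runnable version of the comprehension above, assuming text is a list of lines that have already been stripped of newlines. The sample list is taken from the question, with an empty line added to show the filter.

```python
text = ['85001', '52804-3233', '', 'Winston-Salem']

output = [
    # Shorten only when both the first five characters and
    # everything after index 5 are numeric.
    line[:5] if line[:5].isnumeric() and line[6:].isnumeric() else line
    for line in text
    if line  # skip empty lines
]
print(output)  # ['85001', '52804', 'Winston-Salem']
```

The character at index 5 is never checked, so a value like '52804x3233' would also be shortened to '52804'. If that matters for your data, add a test such as line[5:6] == '-'.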
