Finding numeric characters in a single-word string (Python)

Time: 2016-10-25 15:06:25

Tags: python

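I have a variety of values in a text field of a CSV.

Some of the values look like this: AGM00BALDWIN AGM00BOUCK

However, some have duplicates, which changes the names to: AGM00BOUCK01 AGM00COBDEN01 AGM00COBDEN02

My goal is to write a specific ID to the values that do not have a numeric suffix.

This is the code so far: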

prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)

if "*1" not in name and "*2" not in name:
    prov_ID = prov_count + 1
else:
    prov_ID = ""

It seems that a wildcard is not the appropriate approach here, but I can't seem to find a suitable solution.

3 Answers:

Answer 0: (score: 1)

There are different ways to do this. One uses the isdigit function:

a = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

for i in a:
  if i[-1].isdigit():  # can use i[-1] and i[-2] for both numbers
    print (i)


Using regex:

import re
a = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

pat = re.compile(r"^.*\d$")  # can use "\d\d" instead of "\d" for 2 numbers
for i in a:
  if pat.match(i): print (i)

Another:

for i in a:
    if i[-1:] in map(str, range(10)): print (i)

All of the above methods return the inputs that have a numeric suffix:

AGM00BOUCK01
AGM00COBDEN01
AGM00COBDEN02
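
Since the goal in the question is the opposite set (names without a numeric suffix), the isdigit check can be wrapped in a small helper and negated. This is only a sketch; the helper name has_numeric_suffix is made up for illustration. Note that every name also contains digits in the "AGM00" prefix, so the check has to look at the end of the string only.

```python
def has_numeric_suffix(name):
    # Hypothetical helper, not part of the original answer.
    # name[-1:] is "" for an empty string and "".isdigit() is False,
    # so this does not raise IndexError the way name[-1] would.
    return name[-1:].isdigit()

names = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

# Keep only the names that do NOT end in a digit.
without_suffix = [n for n in names if not has_numeric_suffix(n)]
print(without_suffix)  # ['AGM00BALDWIN', 'AGM00BOUCK']
```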

Answer 1: (score: 1)

Using a regular expression seems appropriate here:

import re

pattern= re.compile(r'(\d+$)')

prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)

if pattern.search(name) is None:  # search, not match: match only tests at the start of the string
    prov_ID = prov_count + 1
else:
    prov_ID = ""
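
One thing to watch with this pattern: re.match only tries to match at the beginning of the string, and it returns None (never False) when there is no match. A short sketch of the difference, using one of the names from the question:

```python
import re

pattern = re.compile(r"\d+$")

name = "AGM00BOUCK01"
print(pattern.match(name))           # None: the string does not start with digits
print(pattern.search(name).group())  # 01
print(pattern.search("AGM00BOUCK"))  # None: the "00" is not at the end
```

So for a trailing-digits test, search (or a pattern that consumes the leading characters first) is the call that actually finds the suffix.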

Answer 2: (score: 0)

You can use slicing to take the last 2 characters of each element, then check whether they are '01' or '02':

l = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

for i in l:
    if i[-2:] in ('01', '02'):
        print('{} is a duplicate'.format(i))

Output:

AGM00BOUCK01 is a duplicate
AGM00COBDEN01 is a duplicate
AGM00COBDEN02 is a duplicate

Or, as another approach, use the str.endswith method:

l = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

for i in l:
    if i.endswith('01') or i.endswith('02'):
        print('{} is a duplicate'.format(i))
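
As a small variation, str.endswith also accepts a tuple of suffixes, which avoids chaining several calls with or. The suffix list below is an assumption based on the sample names; it would need extending if higher duplicate numbers can occur.

```python
names = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

# Assumed set of duplicate suffixes, taken from the sample data only.
suffixes = ("01", "02")

# endswith returns True if the string ends with any suffix in the tuple.
duplicates = [n for n in names if n.endswith(suffixes)]
print(duplicates)  # ['AGM00BOUCK01', 'AGM00COBDEN01', 'AGM00COBDEN02']
```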

So your code would look like this:

prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)

if name[-2:] not in ('01', '02'):  # no duplicate suffix, so assign an ID
    prov_ID = prov_count + 1
else:
    prov_ID = ""
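
Putting it together for a whole file, a rough sketch might look like the following. It assumes the rows arrive as (name, x, y) tuples and that each unsuffixed name should get the next ID after 3000; the function name assign_ids, the incrementing counter, and the sample coordinates are all illustrative assumptions, not something stated in the question.

```python
import re

# Matches one or more digits at the very end of the string.
SUFFIX = re.compile(r"\d+$")

def assign_ids(rows, start=3000):
    """Return (name, x, y, prov_ID) tuples; prov_ID is "" for suffixed names."""
    prov_count = start
    result = []
    for name, x, y in rows:
        if SUFFIX.search(name) is None:
            # No numeric suffix: hand out the next ID (assumed behaviour).
            prov_count += 1
            prov_ID = prov_count
        else:
            prov_ID = ""
        result.append((name, x, y, prov_ID))
    return result

# Made-up sample rows; the coordinates are placeholders.
rows = [
    ("AGM00BALDWIN", 0.0, 0.0),
    ("AGM00BOUCK", 1.0, 1.0),
    ("AGM00BOUCK01", 1.0, 1.0),
]
for row in assign_ids(rows):
    print(row)
```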