Question

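I'm trying to find everything in a list that has a format like "######-##".

I think I have the right idea in the code below, but it doesn't print anything. Some of the values in my list have that format, so I'd expect it to print them. Can you tell me what's wrong?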

for line in list_nums:
    if (line[-1:].isdigit()):
        if (line[-2:-1].isdigit()):
            if (line[-6:-5].isdigit()):
                if ("-" in line[-3:-2]):
                    print(list_nums)

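The values in my list come in formats such as 123456-56 and 123456-98-98, which is why I wrote the checks above. The list is pulled from an Excel sheet.

Here is my updated code.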

import xlrd
from re import compile, match

file_location = "R:/emily/emilylistnum.xlsx"
workbook = xlrd.open_workbook(file_location)
sheet = workbook.sheet_by_index(0)
regexp = compile(r'^\d{d}-\d{2}$')
list_nums = ""


for row in range(sheet.nrows):
    cell = sheet.cell_value(row,0)
    if regexp.match(cell):
        list_nums += cell + "\n"
        print(list_nums)

My Excel sheet contains: 581094-001 581095-001 581096-001 581097-01 5586987-007 SMX53-5567-53BP 552392-01-01 552392-02 552392-03-01 552392-10-01 552392-10-01 580062 580063 580065 580065 580066 543921-01 556664-55

(each value in its own cell, all in one column)

Answer 1

If you only need to match the pattern ######-## (where # is a digit):

>>> from re import compile, match
>>> regexp = compile(r'^\d{6}-\d{2}$')
>>> print([line for line in list_nums if regexp.match(line)])
['132456-78']

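Explanation

compile turns the pattern into a regular expression object, which is more efficient when the same pattern is matched many times. The regular expression is ^\d{6}-\d{2}$, where: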

^  # start of the line
\d{6}-\d{2}  # 6 digits, one hyphen, then 2 digits
$  # end of the line

In a regular expression, \d means a digit (0 through 9) and {6} means "six times", so \d{3} means 3 digits. You should read the Python documentation about regexps.
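To make the pattern concrete, here is a small sketch that runs it against the values listed in the question. The names `cells`, `pattern` and `matches` are only illustrative; the values are typed in by hand rather than read from the spreadsheet.

```python
import re

# Values copied by hand from the question's Excel column
cells = [
    "581094-001", "581095-001", "581096-001", "581097-01",
    "5586987-007", "SMX53-5567-53BP", "552392-01-01", "552392-02",
    "552392-03-01", "552392-10-01", "552392-10-01", "580062",
    "580063", "580065", "580065", "580066", "543921-01", "556664-55",
]

# Exactly 6 digits, a hyphen, then exactly 2 digits
pattern = re.compile(r'^\d{6}-\d{2}$')

matches = [c for c in cells if pattern.match(c)]
print(matches)
# ['581097-01', '552392-02', '543921-01', '556664-55']
```

Values such as 581094-001 (three digits after the hyphen) and 552392-01-01 (an extra group) are rejected because the anchors require the whole string to fit the pattern.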

Full code

An example based on your comment:

import xlrd
from re import compile

file_location = 'file.xlsx'
workbook = xlrd.open_workbook(file_location)
sheet = workbook.sheet_by_index(0)
# Exactly 6 digits, a hyphen, then 2 digits
regexp = compile(r'^\d{6}-\d{2}$')

list_nums = ''
for row in range(sheet.nrows):
    cell = sheet.cell_value(row, 0)
    # Keep only the cells that match the pattern
    if regexp.match(cell):
        list_nums += cell + "\n"
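One thing to watch for: xlrd returns numeric cells as floats, and passing a float to `regexp.match` raises a TypeError. Entries such as 580062 in the question's sheet may well be stored as numbers. The sketch below is one possible way to guard against that; `matching_cells` is a hypothetical helper name, and the `column` list stands in for the values read from the sheet.

```python
import re

PATTERN = re.compile(r'^\d{6}-\d{2}$')

def matching_cells(values):
    """Return the string values that look like ######-## (hypothetical helper)."""
    result = []
    for value in values:
        # Numeric cells come back as floats, so only test real strings
        if isinstance(value, str) and PATTERN.match(value):
            result.append(value)
    return result

# Stand-in for the first column of the sheet (assumed data)
column = ["581097-01", 580062.0, "552392-01-01", "", "543921-01"]
found = matching_cells(column)
print("\n".join(found))
```

With a real workbook you could probably pass `sheet.col_values(0)` as the argument. Collecting the matches in a list and joining them at the end also avoids building the string piece by piece.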

Answer 2

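Your code appears to be doing the right thing, except that you want to print the value of line rather than the value of list_nums.

Another approach to the task at hand is to use regular expressions, which are ideal for pattern recognition.

Edit: the code now assumes list_nums is a single string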
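As a rough sketch of that fix, here are the same four slice checks from the question wrapped in a function, with the matching line collected instead of the whole list. The name `looks_like_code` and the sample values are made up for illustration.

```python
def looks_like_code(line):
    # The same four positional checks used in the question
    return (line[-1:].isdigit()
            and line[-2:-1].isdigit()
            and line[-6:-5].isdigit()
            and "-" in line[-3:-2])

# Sample values (assumed); the last one is deliberately malformed
list_nums = ["123456-56", "123456-98-98", "580062", "x4yz-56"]

matches = [line for line in list_nums if looks_like_code(line)]
print(matches)
# ['123456-56', 'x4yz-56']
```

Note that the slice checks only look at four positions, so a string like x4yz-56 slips through. That looseness is one reason a regular expression is a better fit here.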

import re

rx = re.compile(r'\d{6}-\d{2}\Z')
for line in list_nums.split('\n'):
    if rx.match(line):
        print(line)
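The two answers end their patterns differently ($ versus \Z), and the difference shows up when a value carries a trailing newline. The sketch below compares them on one made-up input; all variable names are illustrative.

```python
import re

with_dollar = re.compile(r'^\d{6}-\d{2}$')
with_z = re.compile(r'\d{6}-\d{2}\Z')

line = "123456-78\n"  # assumed input with a trailing newline

# $ also matches just before a newline at the very end of the string
dollar_hit = bool(with_dollar.match(line))

# \Z matches only at the very end, so the newline blocks the match
z_hit = bool(with_z.match(line))

# re.fullmatch requires the whole string to match, no anchors needed
full_hit = bool(re.fullmatch(r'\d{6}-\d{2}', line))

# Stripping whitespace first makes the strict forms match again
stripped_hit = bool(with_z.match(line.strip()))

print(dollar_hit, z_hit, full_hit, stripped_hit)
# True False False True
```

If the values might carry stray whitespace, stripping each one before matching is a simple way to get consistent results with either pattern.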

Find all items in a list that match a specific format