Question

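I am using a regular expression to strip fields out of my data.

If I hard-code the data and match it against the regex, it works fine. But if I use a for-each loop and pass the loop variable to re.match(), I get the following error: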

     re.VERBOSE
AttributeError: 'NoneType' object has no attribute 'groups'

My code:

trs = soup.findAll("tr")
for tr in trs:
    c = unicodedata.normalize('NFKD', tr.text)
    y.append(str(c))
for x in y:
    #data1 = "Ambala 1.2 Onion 1200 2000 1500"
    x1 =    ([c.strip() for c in re.match(r"""
        (?P<market>[^0-9]+)
        (?P<arrivals>[^ ]+)
        (?P<variety>[^0-9]+)
        (?P<min>[0-9]+)
        \ (?P<max>[0-9]+)
        \ (?P<modal>[0-9]+)""",
        x,
        re.VERBOSE
    ).groups()])

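If I set data1 = "Ambala 1.2 Onion 1200 2000 1500" and match against that, it works fine.

Can anyone tell me how to iterate over the values correctly in the loop so that I get the fields and avoid the error?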

Answer 1

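I don't quite understand what you are trying to do with the loop, but I can answer why the error occurs.

It looks like you are trying to match a single character against that regex.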

y.append(str(c))

appends one character to y, and you then loop over each of those characters with

for x in y:

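The regex will never match a single character, because it needs at least 8 characters to match.

When re.match() fails to match the string, it returns None, and None has no attribute 'groups', which is the error you are getting.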
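To make the failure mode concrete, here is a minimal sketch. The two-group pattern is a shortened stand-in for the pattern in the question (an assumption made for brevity), and the input strings are invented for illustration.

```python
import re

# Shortened stand-in for the question's pattern (assumption, for brevity).
pattern = re.compile(r"(?P<market>[^0-9]+)(?P<arrivals>[^ ]+)")

hit = pattern.match("Ambala 1.2")   # starts with non-digits, so it matches
miss = pattern.match("123")         # starts with a digit, so there is no match

print(hit.groups())   # ('Ambala ', '1.2')
print(miss)           # None

try:
    miss.groups()     # calling a method on None
except AttributeError as exc:
    print(exc)        # 'NoneType' object has no attribute 'groups'
```

Any input that the pattern cannot match from its first character onward produces None, so the result has to be checked before calling .groups().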

Answer 2

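If your data is structured so that you do not expect a match on every item and you just want to collect the ones that do match, you can split the inline loop apart to build x1 and check the result: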

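x1 = []
for x in y:
    tmp = re.match(r""" ...""", x, re.VERBOSE)
    try:
        # groups is a method and has to be called
        x1 = [c.strip() for c in tmp.groups()]
    except AttributeError:
        # tmp is None when the pattern did not match
        print("no match found in x:{}".format(x))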

Or use an if statement:

    ...
    if tmp is not None:
        x1 = [c.strip() for c in tmp.groups()]
    else:
        print("no match found in x:{}".format(x))
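Putting the pieces together, the following is a self-contained sketch of the "collect only the matches" approach. It reuses the pattern from the question; the helper name parse_rows and the sample rows (a header line, one data line, and an empty string) are hypothetical and only stand in for whatever y actually contains.

```python
import re

# Pattern copied from the question; re.VERBOSE lets it span several lines.
ROW_RE = re.compile(r"""
    (?P<market>[^0-9]+)
    (?P<arrivals>[^ ]+)
    (?P<variety>[^0-9]+)
    (?P<min>[0-9]+)
    \ (?P<max>[0-9]+)
    \ (?P<modal>[0-9]+)""", re.VERBOSE)


def parse_rows(lines):
    """Return the stripped fields of every line that matches ROW_RE."""
    parsed = []
    for line in lines:
        m = ROW_RE.match(line)
        if m is None:
            # Header rows, blank rows, and so on are skipped.
            continue
        parsed.append([field.strip() for field in m.groups()])
    return parsed


# Hypothetical sample input standing in for the list y.
sample = [
    "Market Arrivals Variety Min Max Modal",
    "Ambala 1.2 Onion 1200 2000 1500",
    "",
]
rows = parse_rows(sample)
print(rows)
```

Compiling the pattern once and appending each parsed row to a list keeps every match, rather than overwriting x1 on each pass through the loop.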

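If your data should always produce a match, then your regex is not written correctly and you need to debug it. (I find the ipython terminal very useful for testing a regex while designing it.)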
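One way to debug is to print repr() of a failing string and compare it with the hard-coded string that works. The sketch below assumes, purely for illustration, that the text extracted from a table row separates its cells with newlines instead of single spaces; the raw value is invented, and the real data may differ.

```python
import re

# Pattern copied from the question.
ROW_RE = re.compile(r"""
    (?P<market>[^0-9]+)
    (?P<arrivals>[^ ]+)
    (?P<variety>[^0-9]+)
    (?P<min>[0-9]+)
    \ (?P<max>[0-9]+)
    \ (?P<modal>[0-9]+)""", re.VERBOSE)

# Hypothetical row text: cells separated by newlines (an assumption).
raw = "\nAmbala\n1.2\nOnion\n1200\n2000\n1500\n"

print(repr(raw))            # repr() makes hidden whitespace visible
print(ROW_RE.match(raw))    # None: the pattern requires literal spaces

# Collapse every run of whitespace into a single space, then retry.
cleaned = " ".join(raw.split())
match = ROW_RE.match(cleaned)
print(match.groupdict())
```

If the repr() output shows something else, such as header rows or empty strings, the same check-for-None approach still applies; only the cleanup step would change.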

AttributeError: 'NoneType' object has no attribute 'groups' when trying to loop
