Question

I have tracked down a bug in SCons's dependency logic for D sources.

The self.cre regexp import\s+(?:[a-zA-Z0-9_.]+)\s*(?:,\s*(?:[a-zA-Z0-9_.]+)\s*)*; in SCons.Scanner.D does not cover...

patterns such as

import IMPORT_PATH : SYMBOL;

...only:

import IMPORT_PATH;

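The same goes for the self.cre2 regexp (?:import\s)?\s*([a-zA-Z0-9_.]+)\s*(?:,|;) two lines later.

I think both the self.cre and self.cre2 regexps need fixing, but I don't quite understand how they relate to each other. My guess is that self.cre matches whole import statements and self.cre2 matches parts of them. Am I right? If so, self.cre2 needs to be corrected to handle cases such as: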

import X, Y, Z;

Does anyone know how to fix the regexps so that they handle these cases?

My first attempt was to change

p = 'import\s+(?:[a-zA-Z0-9_.]+)\s*(?:,\s*(?:[a-zA-Z0-9_.]+)\s*)*;'

to

p = 'import\s+(?:[a-zA-Z0-9_.]+)\s*(?:,\s*(?:[a-zA-Z0-9_.]+)(?:\s*:\s*[a-zA-Z0-9_.]+)??\s*)*;'

I tried debugging it, but to no avail.

In Python:

import re
p = 'import\s+(?:[a-zA-Z0-9_.]+)\s*(?:,\s*(?:[a-zA-Z0-9_.]+)\s*)*;'

re.match(p, "import first;") # match
re.match(p, "import first : f;") # no match

p2 = 'import\s+(?:[a-zA-Z0-9_.]+)\s*(?:,\s*(?:[a-zA-Z0-9_.]+)(?:\s*:\s*[a-zA-Z0-9_.]+)??\s*)*;'

re.match(p2, "import first;") # match
re.match(p2, "import first : f;") # no match but should match
re.match(p2, "import first : f, second : g;") # no match but should match

Answer 1

Short Answer

To handle all the cases you outlined, try the following twist on your change to the (self.cre) pattern:

import\s+(?:[a-zA-Z0-9_.]+)\s*(?:(?:\s+:\s+[a-zA-Z0-9_.]+\s*)?(?:,\s*(?:[a-zA-Z0-9_.]+)(?:\s*:\s*[a-zA-Z0-9_.]+)??\s*)*)*;

Digging Deeper

self.cre vs. self.cre2

Yes, the find_include_names method...

def find_include_names(self, node):
    includes = []
    for i in self.cre.findall(node.get_text_contents()):
        includes = includes + self.cre2.findall(i)
    return includes

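...confirms the relationship you guessed between self.cre and self.cre2: the former matches entire import statements, and the latter matches (and captures) the modules within them. (Note the capturing ( ... ) group in self.cre2, as opposed to the non-capturing (?: ... ) groups used everywhere else in self.cre and self.cre2.)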
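
To make the two-stage behavior concrete, here is a minimal standalone sketch using the two patterns exactly as quoted in the question. The module-level find_include_names function is a hypothetical stand-in that takes a plain string instead of an SCons node, and the sample D source is made up for illustration.

```python
import re

# The two patterns quoted in the question, unchanged (raw strings keep the backslashes intact).
CRE = re.compile(r'import\s+(?:[a-zA-Z0-9_.]+)\s*(?:,\s*(?:[a-zA-Z0-9_.]+)\s*)*;')
CRE2 = re.compile(r'(?:import\s)?\s*([a-zA-Z0-9_.]+)\s*(?:,|;)')


def find_include_names(text):
    """Hypothetical stand-in for the scanner method: takes a str, not a node."""
    includes = []
    # CRE has no capturing group, so findall returns each whole import statement.
    for statement in CRE.findall(text):
        # CRE2 has one capturing group, so findall returns just the module names.
        includes.extend(CRE2.findall(statement))
    return includes


# Made-up D source: two plain imports and one selective import.
source = "import std.stdio;\nimport a.b, c.d;\nimport first : f;\n"

statements = CRE.findall(source)
modules = find_include_names(source)
print(statements)
print(modules)
```

With these inputs the selective import never reaches the second stage, because the first pattern does not match it at all, which is the bug described in the question.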

self.cre

Picking up where your Python snippet left off...

import re

import1 = "import first;"
import2 = "import first : f;"
import3 = "import first : f, second : g;"


# Original self.cre pattern (raw string, so the backslashes reach the regex engine as-is)
p = r'import\s+(?:[a-zA-Z0-9_.]+)\s*(?:,\s*(?:[a-zA-Z0-9_.]+)\s*)*;'

pm1 = re.match(p, import1) # match
if pm1 is not None:
    print("p w/ import1 => " + pm1.group(0))

pm2 = re.match(p, import2) # no match
if pm2 is not None:
    print("p w/ import2 => " + pm2.group(0))


# Your first attempt at a fix
p2 = r'import\s+(?:[a-zA-Z0-9_.]+)\s*(?:,\s*(?:[a-zA-Z0-9_.]+)(?:\s*:\s*[a-zA-Z0-9_.]+)??\s*)*;'

p2m1 = re.match(p2, import1) # match
if p2m1 is not None:
    print("p2 w/ import1 => " + p2m1.group(0))

p2m2 = re.match(p2, import2) # no match but should match
if p2m2 is not None:
    print("p2 w/ import2 => " + p2m2.group(0))

p2m3 = re.match(p2, import3) # no match but should match
if p2m3 is not None:
    print("p2 w/ import3 => " + p2m3.group(0))

...we get the following expected output from the attempts to match the import statements against p and p2:

p w/ import1 => import first;
p2 w/ import1 => import first;

Now consider p2prime, where I've made the changes to arrive at the pattern I suggested above:

import re

import1 = "import first;"
import2 = "import first : f;"
import3 = "import first : f, second : g;"
import4 = "import first, second, third;"

# Suggested self.cre pattern: the optional " : symbol" part is allowed after the first module too
p2prime = r'import\s+(?:[a-zA-Z0-9_.]+)\s*(?:(?:\s+:\s+[a-zA-Z0-9_.]+\s*)?(?:,\s*(?:[a-zA-Z0-9_.]+)(?:\s*:\s*[a-zA-Z0-9_.]+)??\s*)*)*;'

p2pm1 = re.match(p2prime, import1) # match
if p2pm1 is not None:
    print("p2prime w/ import1 => " + p2pm1.group(0))

p2pm2 = re.match(p2prime, import2) # now a match
if p2pm2 is not None:
    print("p2prime w/ import2 => " + p2pm2.group(0))

p2pm3 = re.match(p2prime, import3) # now a match
if p2pm3 is not None:
    print("p2prime w/ import3 => " + p2pm3.group(0))

p2pm4 = re.match(p2prime, import4) # still a match
if p2pm4 is not None:
    print("p2prime w/ import4 => " + p2pm4.group(0))

With the updated pattern (p2prime), we get the following desired output from the attempts to match the import statements:

p2prime w/ import1 => import first;
p2prime w/ import2 => import first : f;
p2prime w/ import3 => import first : f, second : g;
p2prime w/ import4 => import first, second, third;

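This is a fairly long and involved pattern, so I wouldn't be surprised to find opportunities to fine-tune it further; but it does what you want and should provide a solid base for that fine-tuning.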
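
As one possible direction for that fine-tuning, here is a sketch that builds a flatter pattern from a reusable "module, optionally followed by : symbol" piece. The names NAME, ITEM and SIMPLER are mine, not SCons's, and this variant is an assumption rather than a drop-in replacement: unlike the pattern above, it also accepts a colon with no surrounding whitespace.

```python
import re

# Hypothetical building blocks (these names are not part of SCons).
NAME = r'[a-zA-Z0-9_.]+'
# One import item: a module name, optionally followed by ": symbol".
ITEM = NAME + r'(?:\s*:\s*' + NAME + r')?'
# A whole statement: "import", one item, then any number of ", item", then ";".
SIMPLER = re.compile(r'import\s+' + ITEM + r'(?:\s*,\s*' + ITEM + r')*\s*;')

samples = [
    "import first;",
    "import first : f;",
    "import first : f, second : g;",
    "import first, second, third;",
]

for sample in samples:
    m = SIMPLER.match(sample)
    print(sample, "=>", m.group(0) if m is not None else None)

# A dangling colon with no symbol after it is rejected.
print(SIMPLER.match("import first : ;"))
```

Each of the four sample statements is matched in full, and the malformed last one is not matched. Avoiding the nested starred groups also keeps backtracking easier to reason about.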

self.cre2

For self.cre2, likewise try the following pattern:

(?:import\s)?\s*(?:([a-zA-Z0-9_.]+)(?:\s+:\s+[a-zA-Z0-9_.]+\s*)?)\s*(?:,|;)

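Keep in mind, though, that D's <module> : <symbol> selective imports are just that, selective, so capturing only the module name in a selective import might not be what you ultimately need (as opposed to, say, capturing both the module and the selected symbol names). As I noted for the self.cre regex I suggested, there is likely room for further fine-tuning here as well.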
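
To see what this self.cre2 pattern captures in practice, the following sketch runs findall over the statements used earlier. The name CRE2_NEW is mine, and the last statement is an extra case I am assuming for illustration (one module with two selected symbols); it is not from the question.

```python
import re

# The self.cre2 replacement suggested above: one capturing group, for the module name.
CRE2_NEW = re.compile(
    r'(?:import\s)?\s*(?:([a-zA-Z0-9_.]+)(?:\s+:\s+[a-zA-Z0-9_.]+\s*)?)\s*(?:,|;)'
)

statements = [
    "import first;",
    "import first : f;",
    "import first : f, second : g;",
    "import first, second, third;",
    # Assumed extra case: one module, two selected symbols.
    "import std.stdio : writeln, writefln;",
]

# Map each statement to the names the pattern captures from it.
captured = {s: CRE2_NEW.findall(s) for s in statements}
for statement, names in captured.items():
    print(statement, "->", names)
```

The first four statements yield only module names. For the last one the result is ['std.stdio', 'writefln']: the pattern cannot tell a second selected symbol from a second module, so, if I read D's selective-import syntax correctly, writefln would be reported as a dependency by mistake. That is one concrete place where further tuning may be needed.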
