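According to Guido (and some other Python programmers), implicit string literal concatenation is considered harmful. I am therefore trying to identify logical lines that contain such concatenation.
My first (and only) attempt used shlex; I thought of splitting the logical line with posix=False so that I could identify the quote-enclosed parts, and if two of them were adjacent, that would count as "literal concatenation".
However, this fails on multi-line strings, as the following example shows: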
import shlex

shlex.split('""" Some docstring """', posix=False)
# Returns ['""', '" Some docstring "', '""'], which would be flagged as harmful even though it is not
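I could tweak this in some odd ad hoc ways, but I was wondering whether you can think of a simple solution. My intention is to add it to my already extended pep8 validator.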
Answer 0: (score: 3)
Interesting question. I just had to play with it, and since there was no answer yet, I am posting my solution to the problem:
#!/usr/bin/python

import tokenize
import token
import sys

with open(sys.argv[1], 'rU') as f:
    toks = list(tokenize.generate_tokens(f.readline))
for i in xrange(len(toks) - 1):
    tok = toks[i]
    # print tok
    tok2 = toks[i + 1]
    if tok[0] == token.STRING and tok[0] == tok2[0]:
        print "implicit concatenation in line " \
            "{} between {} and {}".format(tok[2][0], tok[1], tok2[1])
You can feed the program to itself; the result should be
implicit concatenation in line 14 between "implicit concatenation in line " and "{} between {} and {}"
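The script above is Python 2 and only compares directly neighbouring tokens. As a rough sketch (not the answer author's code), the same idea could be written for Python 3 as below. The function name find_implicit_concat and the sample input are made up for illustration. It also skips NL and COMMENT tokens, on the assumption that a line break or comment inside brackets should not hide a concatenation. f-strings, which newer Python versions tokenize differently, are not covered.

```python
import io
import tokenize

# Tokens that may sit between two string literals without separating them.
_SKIP = {tokenize.NL, tokenize.COMMENT}


def find_implicit_concat(source):
    """Return (line, first, second) for each pair of adjacent string literals."""
    hits = []
    prev = None  # last STRING token seen with nothing significant after it
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue  # line breaks inside brackets and comments do not count
        if tok.type == tokenize.STRING:
            if prev is not None:
                hits.append((prev.start[0], prev.string, tok.string))
            prev = tok
        else:
            prev = None
    return hits


# Hypothetical input: the comma after "three" is missing.
sample = 'a_list = [\n    "one",\n    "three"  # oops\n    "four",\n]\n'
print(find_implicit_concat(sample))
```

A docstring followed by another string statement is not reported, because the NEWLINE token that ends the first statement resets the state.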
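Answer 1: (score: 2)
I decided to use the suggestion from user2357112 and extend it a little to arrive at the following solution, which I describe here as an extension to the pep8 module: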
import StringIO  # Python 2 module
import tokenize


def python_illegal_concetenation(logical_line):
    """
    A language design mistake from the early days of Python.
    https://mail.python.org/pipermail/python-ideas/2013-May/020527.html

    Okay: val = "a" + "b"
    W610: val = "a" "b"
    """
    w = "W610 implicit string literal concatenation considered harmful"
    sio = StringIO.StringIO(logical_line)
    tgen = tokenize.generate_tokens(sio.readline)
    state = None
    for token_type, _, (_, pos), _, _ in tgen:
        if token_type == tokenize.STRING:
            if state == tokenize.STRING:
                # The previous token was also a string literal.
                yield pos, w
            else:
                state = tokenize.STRING
        else:
            state = None
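To try the check outside of pep8 on Python 3, where the StringIO module no longer exists, something like the following might work. This is only a sketch: implicit_concat_offsets is a hypothetical name, and io.StringIO is swapped in for StringIO.StringIO. Like the original, it yields the column of each string literal that directly follows another one.

```python
import io
import tokenize


def implicit_concat_offsets(logical_line):
    """Yield (column, message) for each string literal that directly follows another."""
    message = "W610 implicit string literal concatenation considered harmful"
    previous_was_string = False
    for tok in tokenize.generate_tokens(io.StringIO(logical_line).readline):
        is_string = tok.type == tokenize.STRING
        if is_string and previous_was_string:
            yield tok.start[1], message
        previous_was_string = is_string


# The two example lines from the docstring above.
print(list(implicit_concat_offsets('val = "a" "b"')))    # one hit, at column 10
print(list(implicit_concat_offsets('val = "a" + "b"')))  # no hits
```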
Answer 2: (score: 0)
One idea for coping with this better is, when you have a list, to put a space (or two) after the closing quote:
aList = [
    'one' ,
    'two' ,
    'three'
    'four' ,
]
Now it is more obvious that 'three' is missing its trailing comma.
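For anyone wondering why the missing comma matters, here is a small illustration (the variable names are invented for the example). Python does not raise an error; it silently joins the two adjacent literals into a single list element.

```python
with_comma = ['one', 'two', 'three', 'four']

# Same list, but the comma after 'three' is missing.
without_comma = [
    'one',
    'two',
    'three'
    'four',
]

print(len(with_comma), len(without_comma))  # 4 3
print(without_comma[-1])                    # threefour
```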
Proposal: I suggest that Python have a pragma indicating that string literal concatenation is forbidden within a region:
@nostringliteralconcat
a = "this" "and" "that" # Would cause a compiler failure
@stringliteralconcat
a = "this" "and" "that" # Successfully Compiles
Allowing concatenation would remain the default (to preserve compatibility).
There is also this thread:
https://groups.google.com/forum/#!topic/python-ideas/jP1YtlyJqxs%5B1-25%5D