I think this is simple, but I can't figure it out myself. I have a series of non-contiguous items, such as:
farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052, ......
I want to put them in a list. What is the simplest way to do that? To be clear, I think the list should look like this:
myItem = ['farm011','farm012','farm013','farm014','farm020','farm022','farm023','farm024','farm025',....]
I apologize if this has already been answered here; I could not find it. Thanks in advance. Cheers!
Update 1: error message from eyquem's code
I copied and pasted the code exactly as you wrote it, and this is the error I get:
File "./test.py", line 11
gen = ( ("%s%03d"%(w1,i) for i in range(int(s),int(e)+1)) if w2
^
SyntaxError: invalid syntax
Answer 0 (score: 1)
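import re

ncitems = "farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052"
items = []
for rng in ncitems.split(','):
    # Each chunk holds one name or a "start - end" pair; split each name into (prefix, digits).
    l = re.findall(r"(\w+?)(\d+)\b", rng)
    if len(l) == 1:
        items.append("%s%s" % l[0])
    elif len(l) == 2:
        (w1, s), (w2, e) = l  # w1 and w2 should be the same
        # range() needs integers, and the end of the range is inclusive
        for i in range(int(s), int(e) + 1):
            items.append("%s%03d" % (w1, i))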
Answer 1 (score: 1)
Here is a simple solution:
#!/usr/bin/python
import re

inp = "farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052"
range_re = re.compile(r"farm(\d+) - farm(\d+)")
items = [i.strip() for i in inp.split(",")]
op_list = []
for i in items:
    result = range_re.match(i)
    if result:
        # A "farmNNN - farmMMM" range: expand it, end included.
        start = int(result.group(1), 10)
        end = int(result.group(2), 10)
        for j in range(start, end + 1):
            op_list.append("farm%03d" % j)
    else:
        # A single item: keep it unchanged.
        op_list.append(i)
print(op_list)
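Answer 2 (score: 0)
Have you looked at python-strings-split-with-multiple-separators?
If you only want to split on - and , you can always do for i in string.split(" - "): and then, if i.find(",") != -1, split i on the comma as well and add the pieces to your list.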
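A minimal sketch of that split-based idea, with the assumptions flagged. It splits on commas first and then on the dash, which is the reverse of the order described above but makes each piece easier to handle. It also assumes that every name ends in digits, that names contain no dash of their own, and that a range keeps the zero padding of its first name. The names split_name and expand_by_splitting are made up for illustration.

```python
def split_name(name):
    """Split 'farm011' into ('farm', '011') by peeling off the trailing digits."""
    prefix = name.rstrip("0123456789")
    return prefix, name[len(prefix):]


def expand_by_splitting(spec):
    """Expand 'farm011 - farm013, farm020' into a flat list of names."""
    names = []
    for chunk in spec.split(","):
        # A chunk is either "name" or "first - last".
        ends = [part.strip() for part in chunk.split("-")]
        if len(ends) == 1:
            names.append(ends[0])
            continue
        first, last = ends
        prefix, start = split_name(first)
        _, stop = split_name(last)
        # Assumption: reuse the width of the first number for zero padding.
        for n in range(int(start), int(stop) + 1):
            names.append("%s%0*d" % (prefix, len(start), n))
    return names


print(expand_by_splitting("farm011 - farm013, farm020"))
# ['farm011', 'farm012', 'farm013', 'farm020']
```

This avoids regular expressions entirely, at the cost of being stricter about the input format than the regex-based answers.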
Answer 3 (score: 0)
Based on the link in WZeberaFFS's answer, modified to include digits:
>>> import re
>>> s="farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052"
>>> re.findall("[\w\d]+",s) #find the words instead of splitting them
['farm011', 'farm018', 'farm020', 'farm022', 'farm033', 'farm041', 'farm052']
>>> re.split(" *[-,] *",s) #another approach, using re.split
['farm011', 'farm018', 'farm020', 'farm022', 'farm033', 'farm041', 'farm052']
Answer 4 (score: 0)
I wanted to correct vartec's solution.
Then, from one correction to the next, I ended up modifying the algorithm and obtained:
# first code
import re

ncitems = 'farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052'
print('ncitems :\n' + ncitems + '\n\n')

items = []
pat = re.compile(r"(\w+)(?<!\d)(\d+)(?:[ -]+(\w+)(?<!\d)(\d+))* *(?:,|\Z)")

for w1, s, w2, e in pat.findall(ncitems):
    print('(w1,s,w2,e)==', (w1, s, w2, e))
    # A range expands to every number from s to e inclusive; a single item is kept as-is.
    items.extend(("%s%03d" % (w1, i) for i in range(int(s), int(e) + 1))
                 if w2
                 else ("%s%s" % (w1, s),))

print('\nitems :\n%s' % items)
Result:
ncitems :
farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052
(w1,s,w2,e)== ('farm', '011', 'farm', '018')
(w1,s,w2,e)== ('farm', '020', '', '')
(w1,s,w2,e)== ('farm', '022', 'farm', '033')
(w1,s,w2,e)== ('farm', '041', 'farm', '052')
items :
['farm011', 'farm012', 'farm013', 'farm014', 'farm015', 'farm016', 'farm017', 'farm018', 'farm020', 'farm022', 'farm023', 'farm024', 'farm025', 'farm026', 'farm027', 'farm028', 'farm029', 'farm030', 'farm031', 'farm032', 'farm033', 'farm041', 'farm042', 'farm043', 'farm044', 'farm045', 'farm046', 'farm047', 'farm048', 'farm049', 'farm050', 'farm051', 'farm052']
Using itertools.chain():
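# second code
from itertools import chain
import re

ncitems = 'farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052'
print('ncitems :\n' + ncitems + '\n\n')

pat = re.compile(r"(\w+)(?<!\d)(\d+)(?:[ -]+(\w+)(?<!\d)(\d+))* *(?:,|\Z)")

# One iterable per match: a generator for a range, a 1-tuple for a single item.
gen = (("%s%03d" % (w1, i) for i in range(int(s), int(e) + 1)) if w2
       else ("%s%s" % (w1, s),)
       for w1, s, w2, e in pat.findall(ncitems))

# chain.from_iterable consumes each inner generator before the next match rebinds w1;
# chain(*gen) would build them all first and then format every range with the last prefix.
items = list(chain.from_iterable(gen))

print('items :\n%s' % items)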
Note that all of this code still works if the elements look like this: far24idi2rm011.
I would write Rumple Stiltskin's code as follows:
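A quick way to check that claim is to run the pattern on such names. The sample string below is invented for the test; the pattern is copied from the code above, written as a raw string. Note that findall returns empty strings for the groups of the optional range part when that part does not match.

```python
import re

# Same pattern as in the code above.
pat = re.compile(r"(\w+)(?<!\d)(\d+)(?:[ -]+(\w+)(?<!\d)(\d+))* *(?:,|\Z)")

# Made-up input whose prefix itself contains digits.
sample = "far24idi2rm011 - far24idi2rm013, far24idi2rm020"

matches = pat.findall(sample)
print(matches)
# [('far24idi2rm', '011', 'far24idi2rm', '013'), ('far24idi2rm', '020', '', '')]
```

The lookbehind (?<!\d) forces the prefix group to stop right before the final run of digits, so the digits inside the prefix do not get mistaken for the item number.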
import re

inp = "farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052"
range_re = re.compile(r"farm(\d+) - farm(\d+)")
op_list = []
# Iterate over the stripped items so that the item itself is still available in the else branch.
for item in (i.strip() for i in inp.split(",")):
    result = range_re.match(item)
    if result:
        start, end = map(int, result.groups())
        for j in range(start, end + 1):
            op_list.append("farm%03d" % j)
    else:
        op_list.append(item)
print(op_list)
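In fact, I would not write Rumple Stiltskin's code at all. In my opinion it is a poor approach: it first calls split(",") and then searches each piece with a regex. A suitable regex can match what is needed directly, so why take the extra step?
If readability is the goal, and in my opinion it is a good goal, I think this code is the simplest and most readable: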
import re

ncitems = 'farm011 - farm018, farm020, farm022 - farm033, farm041 - farm052'
print('ncitems :\n' + ncitems)

pat = re.compile(r"(\w+)(?<!\d)(\d+)(?:[ -]+(\w+)(?<!\d)(\d+))* *(?:,|\Z)")
items = []
for w1, s, w2, e in pat.findall(ncitems):
    if w2:
        # Range: add every name from s to e, end included.
        items.extend("%s%03d" % (w1, i) for i in range(int(s), int(e) + 1))
    else:
        # Single item: rebuild it from its two captured parts.
        items.append("%s%s" % (w1, s))

print('\nitems :\n%s' % items)
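If this is needed in more than one place, the same loop can be wrapped in a function. This is only a sketch: the name expand_ranges is hypothetical, and taking the padding width from the start number instead of the hard-coded %03d is an assumption about what is wanted for names that are not three digits wide.

```python
import re

# Pattern taken from the answer above; the helper name below is hypothetical.
RANGE_PAT = re.compile(r"(\w+)(?<!\d)(\d+)(?:[ -]+(\w+)(?<!\d)(\d+))* *(?:,|\Z)")


def expand_ranges(spec):
    """Return the flat list of names described by a 'a01 - a03, a07' style string."""
    items = []
    for w1, s, w2, e in RANGE_PAT.findall(spec):
        if w2:
            # Assumption: pad to the width of the start number rather than a fixed 3.
            width = len(s)
            items.extend("%s%0*d" % (w1, width, i)
                         for i in range(int(s), int(e) + 1))
        else:
            items.append(w1 + s)
    return items


print(expand_ranges("farm011 - farm013, farm020, n8 - n11"))
# ['farm011', 'farm012', 'farm013', 'farm020', 'n8', 'n9', 'n10', 'n11']
```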