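I have a list with elements of the following form. The strings may change, but the format stays similar:

    ["Radio0","Tether0","Serial0/0","Eth0/0","Eth0/1","Eth1/0","Eth1/1","vlanX","modem0","modem1","modem2","modem3","modem6"]

I want to convert it to the list below. Strings that occur more than once are collapsed, so for example Eth appears only once in the new list, and the numbers are replaced with X and Y to make the names generic:

    ["RadioX","TetherX","SerialX/Y","EthX/Y","vlanX","modemX"]

I have been messing around with different regexes and my approach is quite messy, so I am interested in any elegant solution you can think of.

Here is some code that could be improved. Note that set does not preserve order, so that should be improved as well: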

    import re

    a = ["Radio0","Tether0","Serial0/0","Eth0/0","Eth0/1","Eth0/2","Eth1/0","vlanX","modem0","modem1","modem2","modem3","modem6"]
    c = []
    for i in a:
        b = re.split("[0-9]", i)
        if "/" in i:
            c.append(b[0] + "X/Y")
        elif len(b) > 1:
            c.append(b[0] + "X")
        else:
            # no digit found: b is a one-element list, so append the string itself
            c.append(b[0])
    print(set(c))
    # set(['modemX', 'TetherX', 'RadioX', 'vlanX', 'SerialX/Y', 'EthX/Y'])

Order could be preserved with something like this:

    unique = []
    for item in c:
        # keep only the first occurrence of each item
        if item not in unique:
            unique.append(item)
    print(unique)
    # ['RadioX', 'TetherX', 'SerialX/Y', 'EthX/Y', 'vlanX', 'modemX']

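Answer 0 (score: 2)

The following code should be generic enough to allow up to 3 numbers in a string, and you only need to change the repl variable to allow more.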

    import re

    elements = ["Radio0","Tether0","Serial0/0","Eth0/0","Eth0/1","Eth1/0","Eth1/1","vlanX","modem0","modem1","modem2","modem3","modem6"]
    repl = "XYZ"
    for i in range(len(repl)):
        # \d+ (rather than a single [0-9]) so a multi-digit number maps to one letter
        elements = [re.sub(r"\d+", repl[i], element, 1) for element in elements]
    result = set(elements)

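The snippet above ends with a set, so the original order is lost. A minimal sketch of an order-preserving variant follows. It assumes Python 3.7 or later, where dict keeps insertion order, and the helper name generalize is made up for this illustration.

```python
import re

def generalize(name, letters="XYZ"):
    # Hypothetical helper: replace the first number with X,
    # the second with Y, and so on, one substitution per letter.
    for letter in letters:
        name = re.sub(r"\d+", letter, name, count=1)
    return name

elements = ["Radio0", "Tether0", "Serial0/0", "Eth0/0", "Eth0/1", "Eth1/0",
            "Eth1/1", "vlanX", "modem0", "modem1", "modem2", "modem3", "modem6"]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
result = list(dict.fromkeys(generalize(e) for e in elements))
print(result)
# ['RadioX', 'TetherX', 'SerialX/Y', 'EthX/Y', 'vlanX', 'modemX']
```

On older Python versions, collections.OrderedDict.fromkeys should give the same ordering.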
Answer 1 (score: 1)

    import re

    def particular_case(string):
        # Handle "N/M" pairs first, then any remaining number.
        return re.sub(r"\d+", "X", re.sub(r"\d+/\d+", "X/Y", string))

    def generic_case(string, letters=("X", "Y", "Z")):
        len_letters = len(letters)
        list_matches = list(re.finditer(r"\d+", string))
        result, last_index = "", 0
        if len(list_matches) == 0:
            return string
        for index, match in enumerate(list_matches):
            # Copy the text before this number, then the letter that replaces it.
            result += string[last_index:match.start(0)] + letters[index % len_letters]
            last_index = match.end(0)
        # Keep whatever follows the last number.
        return result + string[last_index:]

    if __name__ == "__main__":
        words = ["Radio0", "Tether0", "Serial0/0", "Eth0/0", "Eth0/1", "Eth1/0",
                 "Eth1/1", "vlanX", "modem0", "modem1", "modem2", "modem3", "modem6"]
        result = []
        result2 = []
        for w in words:
            new_value = particular_case(w)
            if new_value not in result:
                result.append(new_value)
            new_value = generic_case(w)
            if new_value not in result2:
                result2.append(new_value)
        print(result)
        print(result2)

Answer 2 (score: 1)

I use re.finditer to find and replace all the numbers:
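
    import re

    # the input list from the question
    l = ["Radio0","Tether0","Serial0/0","Eth0/0","Eth0/1","Eth1/0","Eth1/1","vlanX","modem0","modem1","modem2","modem3","modem6"]

    def repl(string):
        # use regex to find all numbers
        numbers = re.finditer(r'\d+', string)
        # pair the numbers with letters. zip will stop when the sequence of
        # numbers OR letters runs out.
        pairs = list(zip(numbers, 'XYZ'))  # add more characters if necessary
        # replace from right to left so the earlier match positions stay valid
        for match, char in reversed(pairs):
            string = string[:match.start()] + char + string[match.end():]
        return string

    s = set()  # set to keep track of duplicates; result keeps the order
    result = []
    for string in l:
        string = repl(string)
        if string in s:  # ignore if duplicate
            continue
        # otherwise add to result list
        s.add(string)
        result.append(string)
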
This replaces up to 3 numbers with X, Y or Z, and can easily be modified to support more.
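One way to support more letters without slicing by match position is to pass a function as the replacement to re.sub. This is only a sketch: the name replace_numbers and the choice to leave extra numbers untouched once the letters run out are assumptions of this example, not part of the answer above.

```python
import re

def replace_numbers(name, letters="XYZ"):
    # Hypothetical helper: each successive number in the string takes
    # the next letter from `letters`.
    letter_iter = iter(letters)
    # re.sub calls the function once per match, left to right.
    # When the letters are exhausted, next() falls back to the matched
    # text, so any further numbers are left unchanged.
    return re.sub(r"\d+", lambda m: next(letter_iter, m.group()), name)

print(replace_numbers("Eth10/20"))          # EthX/Y
print(replace_numbers("a1b2c3d4"))          # aXbYcZd4
print(replace_numbers("a1b2c3d4", "XYZW"))  # aXbYcZdW
```

Because the substitution happens in a single pass over the original string, multi-digit numbers do not shift the positions of later matches.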
Answer 3 (score: 1)

You could go for:

    import re

    rx = r'\d+'
    incoming = ["Radio0","Tether0","Serial0/0","Eth0/0","Eth0/1","Eth1/0","Eth1/1","vlanX","modem0","modem1","modem2","modem3","modem6"]
    outgoing = []
    for item in incoming:
        t = re.sub(rx, 'X', item)
        if t not in outgoing:
            outgoing.append(t)
    print(outgoing)
    # ['RadioX', 'TetherX', 'SerialX/X', 'EthX/X', 'vlanX', 'modemX']

Or, with the help of Python's mighty list comprehensions (just another syntax example):

    import re

    rx = re.compile(r'\d+')
    incoming = ["Radio0","Tether0","Serial0/0","Eth0/0","Eth0/1","Eth1/0","Eth1/1","vlanX","modem0","modem1","modem2","modem3","modem6"]

    def cleanitem(item):
        return rx.sub('X', item)

    outgoing = []
    [outgoing.append(item)
     for item in (cleanitem(x) for x in incoming)
     if item not in outgoing]
    print(outgoing)

See a working demo on ideone.com.
Answer 4 (score: 1)

    import re
    import functools

    lst = ["Radio0","Tether0","Serial0/0","Eth0/0","Eth0/1","Eth1/0","Eth1/1","vlanX","modem0","modem1","modem2","modem3","modem6"]

    def process_str(s, letters='XY'):
        # replace the first remaining number with each letter in turn
        return functools.reduce(lambda txt, letter: re.sub(r'\d+', letter, txt, 1), letters, s)

    r = set(map(process_str, lst))
    print(r)