I'm writing a function to find the names of the processes running on a system. I receive an array like this:
['\\\\TEST-PC\\Process(python)\\Operations/sec',
'\\\\TEST-PC\\Process(process#2)\\Operations/sec',
'\\\\TEST-PC\\Process(process#1)\\Operations/sec',
'\\\\TEST-PC\\Process(process)\\Operations/sec',
'\\\\TEST-PC\\Process(python)\\Thread Count',
'\\\\TEST-PC\\Process(process#2)\\Thread Count',
'\\\\TEST-PC\\Process(process#1)\\Thread Count',
'\\\\TEST-PC\\Process(process)\\Thread Count'....etc....]
I want to output the name of each process in an array like this:
['python','process#2','process#1','process']
(Note that if a process appears more than once in the original array, I don't want duplicates in the output array.)
Here's what I have so far:
import re

def count_no_of_processes(row_to_check):
    # Ignore first entry
    to_search = row_to_check[1:]
    processes = []
    for number in range(0, len(to_search)):
        search = re.search(r"\(([^)]+)\)", to_search[number])
        processes.append(search)
    print(processes)
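But this doesn't give me the process names. Instead, the processes list shows entries like "<_sre.SRE_Match object at 0x10c1fw321>".
What am I doing wrong?
I haven't yet got to the stage of checking the processes list for duplicates, but any advice would be appreciated, since I'm not familiar with regular expressions.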
Answer 0 (score: 1)
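Remember that re.search() returns a MatchObject. To extract what you want, use something like match.group(1), which returns the first matched group, in other words the text captured inside the () capturing group of your regex.
Note that before calling .group you should check that a match was actually found: if re.search doesn't match, it returns None, and calling None.group raises an error.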
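A minimal sketch of that check, reusing the regex from the question. The helper name extract_name and the second sample string are hypothetical, made up only for illustration:

```python
import re

def extract_name(header):
    """Return the text inside the first (...) in header, or None if there is none."""
    match = re.search(r"\(([^)]+)\)", header)
    if match is None:  # re.search returns None when nothing matches
        return None
    return match.group(1)  # group(1) is the text captured by the inner (...)

print(extract_name('\\\\TEST-PC\\Process(python)\\Thread Count'))
print(extract_name('no parentheses here'))
```

The first call prints the captured name, and the second prints None instead of raising AttributeError.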
To address your secondary question about duplicates, I'd suggest using a set.
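As a rough sketch of the set idea (the variable names headers and names are assumptions, and the input is a shortened version of the list from the question):

```python
import re

# Shortened sample input; the real list comes from the question.
headers = [
    '\\\\TEST-PC\\Process(python)\\Operations/sec',
    '\\\\TEST-PC\\Process(process)\\Operations/sec',
    '\\\\TEST-PC\\Process(python)\\Thread Count',
]

names = set()
for header in headers:
    match = re.search(r"\(([^)]+)\)", header)
    if match:
        names.add(match.group(1))  # adding a name that is already present does nothing

print(sorted(names))  # sorted only to get a stable display order
```

Keep in mind that a set has no defined order, so this fits only if the order of the names doesn't matter.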
Answer 1 (score: 1)
You could come up with:
import re

processes = ['\\\\TEST-PC\\Process(python)\\Operations/sec',
             '\\\\TEST-PC\\Process(process#2)\\Operations/sec',
             '\\\\TEST-PC\\Process(process#1)\\Operations/sec',
             '\\\\TEST-PC\\Process(process)\\Operations/sec',
             '\\\\TEST-PC\\Process(python)\\Thread Count',
             '\\\\TEST-PC\\Process(process#2)\\Thread Count',
             '\\\\TEST-PC\\Process(process#1)\\Thread Count',
             '\\\\TEST-PC\\Process(process)\\Thread Count']

rx = re.compile(r'Process\(([^)]+)\)')

processes_filtered = []
for process in processes:
    match = rx.search(process)
    if match is not None:
        # Append each name only once, keeping first-seen order
        if match.group(1) not in processes_filtered:
            processes_filtered.append(match.group(1))

print(processes_filtered)
# ['python', 'process#2', 'process#1', 'process']
Or, even shorter, using a list comprehension:
rx = re.compile(r'Process\(([^)]+)\)')
processes_filtered = set([m.group(1) \
for process in processes \
for m in [rx.search(process)] if m])
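One caveat with the set version is that it doesn't keep the original order. If order matters, a possible alternative (not part of the answer above, and assuming Python 3.7+, where dicts preserve insertion order) is dict.fromkeys:

```python
import re

processes = ['\\\\TEST-PC\\Process(python)\\Operations/sec',
             '\\\\TEST-PC\\Process(process#2)\\Operations/sec',
             '\\\\TEST-PC\\Process(process#1)\\Operations/sec',
             '\\\\TEST-PC\\Process(process)\\Operations/sec',
             '\\\\TEST-PC\\Process(python)\\Thread Count',
             '\\\\TEST-PC\\Process(process#2)\\Thread Count',
             '\\\\TEST-PC\\Process(process#1)\\Thread Count',
             '\\\\TEST-PC\\Process(process)\\Thread Count']

rx = re.compile(r'Process\(([^)]+)\)')

# Lazily search every entry; non-matching entries yield None and are skipped below.
matches = (rx.search(process) for process in processes)

# dict keys are unique and keep first-seen order, so this de-duplicates in order.
ordered_unique = list(dict.fromkeys(m.group(1) for m in matches if m))

print(ordered_unique)
# ['python', 'process#2', 'process#1', 'process']
```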
Answer 2 (score: 0)
If order doesn't matter, you can do this:
>>> import re
>>> tgt=['\\\\TEST-PC\\Process(python)\\Operations/sec',
... '\\\\TEST-PC\\Process(process#2)\\Operations/sec',
... '\\\\TEST-PC\\Process(process#1)\\Operations/sec',
... '\\\\TEST-PC\\Process(process)\\Operations/sec',
... '\\\\TEST-PC\\Process(python)\\Thread Count',
... '\\\\TEST-PC\\Process(process#2)\\Thread Count',
... '\\\\TEST-PC\\Process(process#1)\\Thread Count',
... '\\\\TEST-PC\\Process(process)\\Thread Count']
>>> {m.group(1) for m in re.finditer(r'^[^(]+\(([^)]+)\)', '\n'.join(tgt), flags=re.M)}
set(['python', 'process#2', 'process#1', 'process'])