I am trying to process Linux output in Python. Here is my output on Linux:
machine01:/mnt/vlm/log-prod machine02:/mnt/machine01_vlm/log-prod Transferred 17:46:14 Idle
machine01:/mnt/vlm/log-test machine02:/mnt/machine01_vlm/log-test Transferred 17:46:14 Idle
machine01:/mnt/wndchl/- machine02:/mnt/machine01_wndchl/machine01_wndchl_machine01_wndchl Transferred 18:36:10 Idle
machine01:/mnt/wndchl/prod machine02:/mnt/machine01_wndchl/prod Transferred 18:36:10 Idle
machine01:/mnt/wndchl/test machine02:/mnt/machine01_wndchl/test Transferred 18:36:10 Idle
machine01:/mnt/iso/Archive machine02:/mnt/iso/Archive Transferred 19:06:10 Idle
machine01:/mnt/iso/Ready To Transfer machine02:/mnt/iso/ReadyxToxTransfer Transferred 19:06:10 Idle
machine01:/mnt/iso/- machine02:/mnt/iso/iso_machine01_iso Transferred 19:06:10 Idle
machine01:/mnt/it/SCCM machine02:/mnt/it/SCCM Transferred 19:25:51 Idle
machine01:/mnt/it/Windows machine02:/mnt/it/Windows Transferred 19:25:51 Idle
machine01:/mnt/it/- machine02:/mnt/it/machine01_it_machine01_it Transferred 19:25:51 Idle
machine01:/mnt/it/dcs machine02:/mnt/it/dcs Transferred 19:25:51 Idle
machine01:/mnt/it/hds_perf_logs machine02:/mnt/it/hds_perf_logs Transferred 19:25:51 Idle
machine01:/mnt/legalhold/LegalHold machine02:/mnt/legalhold/LegalHold Transferred 18:46:06 Idle
machine01:/mnt/legalhold/- machine02:/mnt/legalhold/legalhold_machine01_legalhold Transferred 18:46:06 Idle
Here is my Python script:
for x in f.readlines():
    output_data = x.split()
    #Define variable
    source_path = output_data[0]
    dest_path = output_data[1]
    print "working on....",source_path
    relationship = output_data[2]
    #We are only interested with hour,split it out!
    buffer_time = output_data[3].split(":",1)
    relationship_status = output_data[4]
    #Get destination nas hostname
    dest_nas = output_data[1].split(":",1)
    dest_nas_hostname = dest_nas[0]
    #Get the exact hour number and convert it into int
    extracted_hour = int(buffer_time[0])
    if relationship_status == "Idle":
        if extracted_hour > max_tolerate_hour:
            print "Source path : ",source_path
            print "Destination path : ",dest_path
            print "Max threshold(hours): ",max_tolerate_hour
            print "Idle (hours) : ",extracted_hour
            print "======================================================================"
        else:
            pass
print "Scan completed!"
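Everything looks fine, but the script breaks when it reaches line 7 of the output, where the spaces in "Ready To Transfer" throw off the split. I could wrap it in try/except, but that does not solve the problem.
Please let me know what else I can do.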
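To see why that row breaks, it may help to look at what `split()` returns for it. The following is only a small sketch using the problematic row from the output above; the variable names mirror the script, and `hour` is introduced purely for illustration.

```python
row = ("machine01:/mnt/iso/Ready To Transfer "
       "machine02:/mnt/iso/ReadyxToxTransfer Transferred 19:06:10 Idle")

# Splitting on any whitespace also splits inside the source path.
output_data = row.split()
print(len(output_data))   # 7 fields instead of the expected 5
print(output_data[3])     # the destination path, not the time

# The script then tries to turn "machine02" into an int.
try:
    hour = int(output_data[3].split(":", 1)[0])
except ValueError as err:
    hour = None
    print("cannot parse hour:", err)
```

A try/except around this only hides the error; the fields are still shifted, so the row is never checked.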
Answer 0 (score: 0)
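You can split on a regular expression. This regex matches runs of two or more spaces, so it assumes the columns in the real output are separated by more than one space while the spaces inside a path are single:
>>> import re
>>> s = "machine01:/mnt/iso/Ready To Transfer  machine02:/mnt/iso/ReadyxToxTransfer  Transferred  19:06:10  Idle"
>>> re.split(r' {2,}', s)
['machine01:/mnt/iso/Ready To Transfer', 'machine02:/mnt/iso/ReadyxToxTransfer', 'Transferred', '19:06:10', 'Idle']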
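As a rough sketch, the split could be wrapped in a helper that also rejects rows with the wrong number of columns. This assumes the real output separates columns with at least two spaces and that a valid row always has five columns; `split_columns` and `EXPECTED_FIELDS` are names made up for this example.

```python
import re

EXPECTED_FIELDS = 5  # source, destination, relationship, time, status


def split_columns(line):
    """Split one row on runs of two or more spaces.

    Returns the list of columns, or None if the row does not have
    exactly EXPECTED_FIELDS columns.
    """
    fields = re.split(r' {2,}', line.strip())
    if len(fields) != EXPECTED_FIELDS:
        return None
    return fields


row = ("machine01:/mnt/iso/Ready To Transfer   "
       "machine02:/mnt/iso/ReadyxToxTransfer   Transferred   19:06:10   Idle")
fields = split_columns(row)
print(fields)
```

Returning None for malformed rows lets the caller skip or log them instead of failing on an index error later.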
This will still break if your file names contain several consecutive spaces. I suggest using a more tailored regex:
>>> parts = re.search(r'(machine.*)(machine.*)(\s\w+)\s+([0-9:]+)\s+(\w+)', s).groups()
>>> [p.strip() for p in parts]
['machine01:/mnt/iso/Ready To Transfer', 'machine02:/mnt/iso/ReadyxToxTransfer', 'Transferred', '19:06:10', 'Idle']
Edit: that regex breaks on "machine02:/mnt/machine01_vlm/log-prod", because "machine" also appears inside the path. Try this instead:
>>> for line in input_lines.split('\n'):
...     parts = re.search(r'(^machine\d\d:.*)(machine\d\d:.*)(\s\w+)\s+([0-9:]+)\s+(\w+)', line).groups()
...     print([p.strip() for p in parts])
...
['machine01:/mnt/vlm/log-prod', 'machine02:/mnt/machine01_vlm/log-prod', 'Transferred', '17:46:14', 'Idle']
['machine01:/mnt/vlm/log-test', 'machine02:/mnt/machine01_vlm/log-test', 'Transferred', '17:46:14', 'Idle']
['machine01:/mnt/wndchl/-', 'machine02:/mnt/machine01_wndchl/machine01_wndchl_machine01_wndchl', 'Transferred', '18:36:10', 'Idle']
['machine01:/mnt/wndchl/prod', 'machine02:/mnt/machine01_wndchl/prod', 'Transferred', '18:36:10', 'Idle']
['machine01:/mnt/wndchl/test', 'machine02:/mnt/machine01_wndchl/test', 'Transferred', '18:36:10', 'Idle']
['machine01:/mnt/iso/Archive', 'machine02:/mnt/iso/Archive', 'Transferred', '19:06:10', 'Idle']
['machine01:/mnt/iso/Ready To Transfer', 'machine02:/mnt/iso/ReadyxToxTransfer', 'Transferred', '19:06:10', 'Idle']
['machine01:/mnt/iso/-', 'machine02:/mnt/iso/iso_machine01_iso', 'Transferred', '19:06:10', 'Idle']
['machine01:/mnt/it/SCCM', 'machine02:/mnt/it/SCCM', 'Transferred', '19:25:51', 'Idle']
['machine01:/mnt/it/Windows', 'machine02:/mnt/it/Windows', 'Transferred', '19:25:51', 'Idle']
['machine01:/mnt/it/-', 'machine02:/mnt/it/machine01_it_machine01_it', 'Transferred', '19:25:51', 'Idle']
['machine01:/mnt/it/dcs', 'machine02:/mnt/it/dcs', 'Transferred', '19:25:51', 'Idle']
['machine01:/mnt/it/hds_perf_logs', 'machine02:/mnt/it/hds_perf_logs', 'Transferred', '19:25:51', 'Idle']
['machine01:/mnt/legalhold/LegalHold', 'machine02:/mnt/legalhold/LegalHold', 'Transferred', '18:46:06', 'Idle']
['machine01:/mnt/legalhold/-', 'machine02:/mnt/legalhold/legalhold_machine01_legalhold', 'Transferred', '18:46:06', 'Idle']
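Positional groups can be hard to keep track of once there are five of them. As a sketch only, the same idea can be written with named groups. The group names (`src`, `dst`, `relationship`, `time`, `status`) and the helper `parse_row` are invented for this example, and the pattern assumes hostnames always look like "machine" followed by two digits, as in the sample output.

```python
import re

# Assumption: hostnames are "machine" plus two digits, as in the sample data.
ROW = re.compile(
    r'^(?P<src>machine\d\d:.*?)\s+'
    r'(?P<dst>machine\d\d:.*?)\s+'
    r'(?P<relationship>\w+)\s+'
    r'(?P<time>\d+:\d+:\d+)\s+'
    r'(?P<status>\w+)\s*$'
)


def parse_row(line):
    """Return a dict of named columns, or None if the row does not match."""
    match = ROW.match(line)
    return match.groupdict() if match else None


row = ("machine01:/mnt/iso/Ready To Transfer "
       "machine02:/mnt/iso/ReadyxToxTransfer Transferred 19:06:10 Idle")
parsed = parse_row(row)
print(parsed['src'])
print(parsed['time'])
```

Because the function returns None for rows that do not match, blank or malformed lines can be skipped without raising an AttributeError.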
See the documentation for the Python re module.
A good tool for experimenting with regular expressions is https://www.debuggex.com/
Answer 1 (score: 0)
import re

# machine:path  machine:path  relationship  hh:mm:ss  status
LOG_FMT = re.compile(r'(\w+):(.*?)\s+(\w+):(.*?)\s+(\w+)\s+(\d+):(\d+):(\d+)\s+(\w+)')
max_tolerate_hours = 19.2

def main():
    with open('my.log') as inf:
        for row in inf:
            match = LOG_FMT.match(row)
            if match is not None:
                src_machine, src_path, dest_machine, dest_path, rel, hh, mm, ss, status = match.groups()
                hh, mm, ss = int(hh), int(mm), int(ss)
                # Convert hh:mm:ss to fractional hours
                hours = hh + (mm / 60.) + (ss / 3600.)
                if status == 'Idle' and hours > max_tolerate_hours:
                    print('Source path : {}'.format(src_path))
                    print('Destination path : {}'.format(dest_path))
                    print('Max threshold (h) : {:0.2f}'.format(max_tolerate_hours))
                    print('Idle (h) : {:0.2f}'.format(hours))
                    print('=========================================================')
    print('Scan completed!')

if __name__=="__main__":
    main()
For your given data this returns:
Source path : /mnt/it/SCCM
Destination path : /mnt/it/SCCM
Max threshold (h) : 19.20
Idle (h) : 19.43
=========================================================
Source path : /mnt/it/Windows
Destination path : /mnt/it/Windows
Max threshold (h) : 19.20
Idle (h) : 19.43
=========================================================
Source path : /mnt/it/-
Destination path : /mnt/it/machine01_it_machine01_it
Max threshold (h) : 19.20
Idle (h) : 19.43
=========================================================
Source path : /mnt/it/dcs
Destination path : /mnt/it/dcs
Max threshold (h) : 19.20
Idle (h) : 19.43
=========================================================
Source path : /mnt/it/hds_perf_logs
Destination path : /mnt/it/hds_perf_logs
Max threshold (h) : 19.20
Idle (h) : 19.43
=========================================================
Scan completed!
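The "Idle (h)" values come from converting hh:mm:ss into fractional hours, which is why only the /mnt/it rows (19:25:51) exceed a threshold of 19.2 while the /mnt/iso rows (19:06:10) do not. A small sketch of that arithmetic, using a hypothetical helper `to_hours` that is not part of the script above:

```python
def to_hours(hms):
    """Convert an 'hh:mm:ss' string to fractional hours."""
    hh, mm, ss = (int(part) for part in hms.split(':'))
    return hh + mm / 60.0 + ss / 3600.0


threshold = 19.2  # same value as max_tolerate_hours in the script

for stamp in ('19:25:51', '19:06:10'):
    hours = to_hours(stamp)
    print('{} -> {:0.2f} h, over threshold: {}'.format(
        stamp, hours, hours > threshold))
```

Comparing fractional hours rather than only the integer hour, as the original script did, lets the threshold be set more finely than whole hours.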