I have a log file that I need to simplify and export to an Excel file. There is a specific piece of text in this log file, and I need the data that comes after it. How can I read this log file and export it?
Root node processing (before b&c):
Real time = 4.89 sec. (2902.79 ticks)
Parallel b&c, 4 threads:
Real time = 96.05 sec. (38798.86 ticks)
Sync time (average) = 5.63 sec.
Wait time (average) = 0.01 sec.
------------
Total (root+branch&cut) = 100.94 sec. (41701.65 ticks)
Solution pool: 8 solutions saved.
MIP - Integer optimal solution: Objective = 1.1401550956e+03
Solution time = 100.94 sec. Iterations = 501135 Nodes = 2819
Deterministic time = 41701.69 ticks (413.12 ticks/sec)
Incumbent solution
Variable Name Solution Value
x4_5_14 1.000000
x4_5_24 1.000000
x4_5_34 1.000000
x4_5_52 1.000000
x4_5_82 1.000000
x4_5_106 1.000000
x4_5_118 1.000000
x4_5_142 1.000000
x4_5_154 1.000000
x4_6_19 1.000000
x4_6_29 1.000000
x4_6_40 1.000000
x4_6_58 1.000000
x4_6_88 1.000000
x4_6_112 1.000000
x4_6_124 1.000000
x4_6_148 1.000000
x4_6_160 1.000000
x5_5_9 1.000000
x5_5_19 1.000000
x5_5_29 1.000000
x5_5_46 1.000000
x5_5_58 1.000000
x5_5_70 1.000000
x5_5_94 1.000000
x5_5_130 1.000000
x5_5_142 1.000000
x5_5_154 1.000000
x5_5_166 1.000000
x5_5_178 1.000000
x5_6_14 1.000000
x5_6_24 1.000000
x5_6_34 1.000000
x5_6_52 1.000000
x5_6_64 1.000000
x5_6_76 1.000000
x5_6_100 1.000000
x5_6_136 1.000000
x5_6_148 1.000000
x5_6_160 1.000000
x5_6_172 1.000000
x5_6_184 1.000000
x9_5_4 1.000000
x9_5_14 1.000000
x9_5_29 1.000000
x9_5_40 1.000000
x9_5_64 1.000000
x9_5_76 1.000000
x9_5_88 1.000000
x9_5_100 1.000000
x9_5_112 1.000000
x9_5_124 1.000000
x9_5_136 1.000000
x9_5_148 1.000000
x9_5_160 1.000000
x9_5_172 1.000000
x9_6_9 1.000000
x9_6_19 1.000000
x9_6_34 1.000000
x9_6_46 1.000000
x9_6_70 1.000000
x9_6_82 1.000000
x9_6_94 1.000000
x9_6_106 1.000000
x9_6_118 1.000000
x9_6_130 1.000000
x9_6_142 1.000000
x9_6_154 1.000000
x9_6_166 1.000000
x9_6_178 1.000000
x11_1_12 1.000000
x11_1_24 1.000000
x11_1_40 1.000000
x11_1_60 1.000000
x11_1_83 1.000000
x11_1_105 1.000000
x11_1_128 1.000000
x11_1_140 1.000000
x11_1_154 1.000000
x11_2_19 1.000000
x11_2_32 1.000000
x11_2_52 1.000000
x11_2_72 1.000000
x11_2_94 1.000000
x11_2_116 1.000000
x11_2_135 1.000000
x11_2_148 1.000000
x11_2_162 1.000000
x17_1_30 1.000000
x17_1_136 1.000000
x17_2_37 1.000000
x17_2_142 1.000000
x18_1_10 1.000000
x18_1_23 1.000000
x18_1_36 1.000000
x18_1_56 1.000000
x18_1_76 1.000000
x18_1_99 1.000000
x18_1_121 1.000000
x18_1_137 1.000000
x18_1_149 1.000000
x18_1_184 1.000000
x18_1_196 1.000000
x18_1_208 1.000000
x18_2_17 1.000000
x18_2_30 1.000000
x18_2_48 1.000000
x18_2_68 1.000000
x18_2_88 1.000000
x18_2_110 1.000000
x18_2_131 1.000000
x18_2_143 1.000000
x18_2_156 1.000000
x18_2_190 1.000000
x18_2_202 1.000000
x18_2_214 1.000000
x23_1_17 1.000000
x23_1_30 1.000000
x23_1_153 1.000000
x23_2_24 1.000000
x23_2_37 1.000000
x23_2_159 1.000000
x27_1_7 1.000000
x27_1_19 1.000000
x27_1_32 1.000000
x27_1_48 1.000000
x27_1_68 1.000000
x27_1_89 1.000000
x27_1_131 1.000000
x27_1_143 1.000000
x27_1_157 1.000000
x27_1_170 1.000000
x27_1_202 1.000000
x27_2_14 1.000000
x27_2_26 1.000000
x27_2_40 1.000000
x27_2_60 1.000000
x27_2_80 1.000000
x27_2_100 1.000000
x27_2_137 1.000000
x27_2_150 1.000000
x27_2_165 1.000000
x27_2_176 1.000000
x27_2_208 1.000000
x32_1_19 1.000000
x32_1_33 1.000000
x32_1_137 1.000000
x32_1_153 1.000000
x32_2_26 1.000000
macost52 8.710800
macost60 54.797800
macost 599.535600
dricost4 16.339460
dricost5 21.878260
dricost9 25.201540
dricost11 21.324380
dricost17 3.877160
dricost18 26.309300
dricost23 6.369620
dricost27 24.924600
dricost32 8.862080
dricost40 22.432140
dricost41 2.492460
dricost43 21.324380
dricost45 9.969840
dricost46 13.293120
dricost47 11.908420
dricost52 3.877160
dricost60 23.539900
dricost 263.923820
tmil4 115.290000
tmil5 153.720000
tmil9 179.340000
tmil11 138.150000
tmil17 30.700000
tmil18 184.200000
tmil23 46.050000
tmil27 168.850000
tmil32 61.400000
tmil40 153.500000
tmil41 15.350000
tmil43 138.150000
tmil45 61.400000
tmil46 92.100000
tmil47 76.750000
tmil52 25.620000
tmil60 168.850000
tmil 1809.420000
ttime4 295.000000
ttime5 395.000000
ttime9 455.000000
ttime11 385.000000
ttime17 70.000000
ttime18 475.000000
ttime23 115.000000
ttime27 450.000000
ttime32 160.000000
ttime40 405.000000
ttime41 45.000000
ttime43 385.000000
ttime45 180.000000
ttime46 240.000000
ttime47 215.000000
ttime52 70.000000
ttime60 425.000000
ttime 4765.000000
tboar 10275.000000
nbus 34.000000
All other variables matching '*' are 0.
I need the data after the "MIP - Integer optimal solution" line. I want to extract the Objective, Solution time, Iterations, Nodes and Deterministic time values, plus the data below the "Incumbent solution" text.
This is what I tried:
import pandas as pd
import itertools
import os
x = pd.read_csv(os.path.expanduser('G1/Cplex_Cng12/RGroup1_cng12.log'), usecols=[0])
print(x[135:])
But the number of lines above the target text is not always the same, so I can't use skiprows. I need to simplify this and only use the data below the "Incumbent solution" text. I also need the Objective, Solution time, Iterations and Deterministic time values; they are on the same line, and I need to split those values apart.
Answer 0 (score: 1)
You should parse the file using reliable delimiters. Here I chose MIP - Integer optimal solution, \n, Incumbent solution and All other variables matching as delimiters. If these markers are not reliable in your logs, you may need to adjust the code accordingly.
Full code:
import re, io
import pandas as pd

start_collecting_annotations = False
start_collecting_data = False
annotations_lines = []
data_lines = []

with open('/tmp/log.txt') as f:
    while True:
        line = f.readline()
        if line == '':  # if no more lines to read, stop
            break
        if line.startswith('MIP - Integer optimal solution'):
            start_collecting_annotations = True
        if line.startswith('Incumbent solution'):
            start_collecting_data = True
        if start_collecting_annotations:  # here we collect the annotations
            if line == '\n':
                start_collecting_annotations = False
            else:
                annotations_lines.append(line)
        if start_collecting_data:  # here we collect the data
            if line.startswith('All other variables matching'):
                break
            else:
                data_lines.append(line)

# every "Name = number" pair from the summary lines becomes one Series entry
annotations = pd.Series(dict([re.split(r'\s+=\s+', i)
                              for i in re.findall(r'(?:[^\s]+ )?[^\s]+\s+=\s+[^\s]+',
                                                  ' '.join(annotations_lines))
                              ])).astype(float)
# skip data_lines[0] ("Incumbent solution"); runs of 2+ spaces separate the columns
df = pd.read_csv(io.StringIO(''.join(data_lines[1:])), sep=r'\s\s+',
                 index_col=[0], engine='python')
Output:
>>> annotations
Objective 1140.155096
Solution time 100.940000
Iterations 501135.000000
Nodes 2819.000000
Deterministic time 41701.690000
dtype: float64
>>> df.head()
Solution Value
Variable Name
x4_5_14 1.0
x4_5_24 1.0
x4_5_34 1.0
x4_5_52 1.0
x4_5_82 1.0
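Answer 0 stops short of the Excel export the question asks for. A minimal sketch of that last step, assuming pandas with the openpyxl engine is installed; the `annotations` Series and `df` DataFrame here are small hypothetical stand-ins for the objects built in the code above:

```python
import pandas as pd

# Hypothetical stand-ins for the `annotations` and `df` objects built above.
annotations = pd.Series({'Objective': 1140.155096, 'Solution time': 100.94,
                         'Iterations': 501135.0, 'Nodes': 2819.0})
df = pd.DataFrame({'Solution Value': [1.0, 1.0]},
                  index=pd.Index(['x4_5_14', 'x4_5_24'], name='Variable Name'))

# Write both to one workbook: summary values on one sheet, variables on another.
with pd.ExcelWriter('cplex_log.xlsx') as writer:
    annotations.rename('Value').to_frame().to_excel(writer, sheet_name='Summary')
    df.to_excel(writer, sheet_name='Variables')
```

The resulting cplex_log.xlsx can then be opened directly in Excel, or read back with `pd.read_excel('cplex_log.xlsx', sheet_name='Variables', index_col=0)`.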
Answer 1 (score: 0)
Here is a "brute-force" way to do it that plenty of people will frown on, but hey, it works:
import pandas as pd

data = pd.read_csv("test.txt", sep='\t')
col = list(data.columns)[0]  # the log is read as a single column
for i in range(len(data)):
    if data[col][i][0:3] == 'MIP':
        # the character offsets below are hard-coded for this exact log layout
        Objective = float(data[col][i][46:62])
        Solution_time = float(data[col][i + 1][17:23])
        Iteration = int(data[col][i + 1][43:49])
        Nodes = int(data[col][i + 1][59:64])
        Deterministic_time = float(data[col][i + 2][21:29])
        break
print(Objective, Solution_time, Iteration, Nodes, Deterministic_time)
test.txt is the data you posted above; I just copied it into a txt file.
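The hard-coded character slices above (e.g. `[46:62]`) break as soon as CPLEX pads a number differently. A sketch of the same extraction with a regular expression instead of fixed offsets, run here on the summary lines copied from the log above:

```python
import re

# The summary lines from the log above, embedded as a sample string.
lines = ("MIP - Integer optimal solution:  Objective =  1.1401550956e+03\n"
         "Solution time =  100.94 sec.  Iterations = 501135  Nodes = 2819\n"
         "Deterministic time = 41701.69 ticks  (413.12 ticks/sec)\n")

# Capture every "Name = number" pair regardless of column position.
pairs = {name.strip(): float(value)
         for name, value in re.findall(r'([A-Za-z ]+)=\s*([\d.e+]+)', lines)}
print(pairs)
# → {'Objective': 1140.1550956, 'Solution time': 100.94,
#    'Iterations': 501135.0, 'Nodes': 2819.0, 'Deterministic time': 41701.69}
```

In your own run you would read `lines` from the log file instead of a literal string; the regex does not care how many spaces surround the `=`.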
Answer 2 (score: 0)
I created a solution in a different way: I put the data you mentioned into a .txt file, loaded it in Python, wrote it to Excel, then read that back and created two dataframes, one for the head data and a second for the bottom part. I know this is not the most efficient way, but I'm still learning :)
import xlwt
import xlrd
import pandas as pd

book = xlwt.Workbook()
ws = book.add_sheet('First Sheet')
f = open('tekst.txt', 'r+')
data = f.readlines()
for i in range(len(data)):
    row = data[i].split()
    for j in range(len(row)):
        ws.write(i, j, row[j])
# Creation Excel
book.save('Excelfile' + '.xls')
f.close()
# Read Excel back and modify data
df = pd.read_excel('Excelfile.xls')
df = df[['Root', 'node', 'processing', '(before', 'b&c):']]
df = df[10:]
df = df.reset_index(drop=True)
df = df.rename(columns={df.columns[0]: 'Col_1', df.columns[1]: 'Col_2',
                        df.columns[2]: 'Col_3', df.columns[3]: 'Col_4',
                        df.columns[4]: 'Col_5'})
df_head = df[0:3].reset_index(drop=True)
df_bottom = df[7:].reset_index(drop=True)
df_bottom = df_bottom[['Col_1', 'Col_2']]
df_bottom = df_bottom.rename(columns={df_bottom.columns[0]: 'Variable Name',
                                      df_bottom.columns[1]: 'Solution Value'})
The output looks like this:
df_bottom
Col_1 Col_2
0 x4_5_14 1.000000
1 x4_5_24 1.000000
2 x4_5_34 1.000000
3 x4_5_52 1.000000
df_head
Col_1 Col_2 Col_3 Col_4 Col_5
0 MIP - Integer optimal solution:
1 Solution time = 100.94 sec.
2 Deterministic time = 41701.69 ticks
Hope this helps.
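For comparison, the Excel round-trip above isn't needed just to build the dataframes; a sketch that splits the "name value" lines directly, where the `sample` string is a short hypothetical excerpt standing in for tekst.txt:

```python
import io
import pandas as pd

# Short hypothetical excerpt of tekst.txt (the incumbent-solution section).
sample = """Variable Name           Solution Value
x4_5_14                       1.000000
x4_5_24                       1.000000
tboar                     10275.000000
"""

# Keep only lines that split into exactly two tokens (name and value);
# the header line splits into four tokens and is filtered out.
rows = [ln.split() for ln in io.StringIO(sample) if len(ln.split()) == 2]
df_bottom = pd.DataFrame(rows, columns=['Variable Name', 'Solution Value'])
df_bottom['Solution Value'] = df_bottom['Solution Value'].astype(float)
```

Replacing `io.StringIO(sample)` with `open('tekst.txt')` gives the same `df_bottom` as the Excel round-trip, without the intermediate .xls file.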