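I have been working on some code that reads a tab-delimited CSV file representing a series of processes with their start times and durations, and uses pandas to create a DataFrame from it. I then need to apply a simplified form of round-robin scheduling to find each process's turnaround time, with the timeslice taken from user input.
So far I can read the CSV file, label the columns, and sort it correctly. However, I'm stuck when trying to construct a loop that iterates over the rows to find each process's completion time.
My code so far is: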
# round robin
def rr():
    docname = (sys.argv[1])
    method = (sys.argv[2])
    # creates a variable from the user input to define timeslice
    timeslice = int(re.search(r'\d+', method).group())
    # use pandas to create a 2-d data frame from tab delimited file, set column 0 (process names) to string, set column
    # 1 & 2 (start time and duration, respectively) to integers
    d = pd.read_csv(docname, delimiter="\t", header=None, dtype={'0': str, '1': np.int32, '2': np.int32})
    # sort d into d1 by values of start times[1], ascending
    d1 = d.sort_values(by=1)
    # Create a 4th column, set to 0, for the Completion time
    d1[3] = 0
    # change column names
    d1.columns = ['Process', 'Start', 'Duration', 'Completion']
    # initialize counter
    counter = 0
    # if any values in column 'Duration' are above 0, continue the loop
    while (d1['Duration']).any() > 0:
        for index, row in d1.iterrows():
            # if value in column 'Duration' > the timeslice, add the value of the timeslice to the current counter,
            # subtract it from the current value in column 'Duration'
            if row.Duration > timeslice:
                counter += timeslice
                row.Duration -= timeslice
                print(index, row.Duration)
            # if value in column "Duration" <= the timeslice, add the current value of the row:Duration to the counter
            # subtract the Duration from itself, to make it 0
            # set row:Completion to the current counter, which is the completion time for the process
            elif row.Duration <= timeslice and row.Duration != 0:
                counter += row.Duration
                row.Duration -= row.Duration
                row.Completion = counter
                print(index, row.Duration)
            # otherwise, if the value in Duration is already 0, print that index, with the "Done" indicator
            else:
                print(index, "Done")
Given a sample CSV file, d1 looks like:
  Process  Start  Duration  Completion
3      p4      0       280           0
0      p1      5       140           0
1      p2     14        75           0
2      p3     36       320           0
5      p6     40         0           0
4      p5     67       125           0
When I run my code with timeslice = 70, I get an infinite loop:
3 210
0 70
1 5
2 250
5 Done
4 55
3 210
0 70
1 5
2 250
5 Done
4 55
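This seems to iterate through the loop correctly once and then repeat indefinitely. However, print(d1['Completion']) gives all 0 values, which means the correct counter value is never assigned to d1['Completion'].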
Ideally, the Completion values would be filled in with the corresponding times, given timeslice=70, like:
  Process  Start  Duration  Completion
3      p4      0       280         830
0      p1      5       140         490
1      p2     14        75         495
2      p3     36       320         940
5      p6     40         0         280
4      p5     67       125         620
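I could then use this to find the average turnaround time. For some reason, though, my loop seems to iterate once and then repeat endlessly. When I tried swapping the order of the while and for statements, it iterated over each row repeatedly until it reached 0, which also gave incorrect completion times.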
Thanks in advance.
Answer 0 (score: 0)
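I modified your code and it works. The row you get from iterrows() is a copy, so assigning to row.Duration or row.Completion never changes d1 itself. Duration therefore never decreases in the DataFrame, and the loop never ends.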
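To make the copy behaviour concrete, here is a minimal sketch on a made-up two-row frame (the frame `df` and the variable names are illustrative assumptions, not taken from the question). It assumes a frame with mixed column types, as in the question, where iterrows() builds each row as a new Series.

```python
import pandas as pd

# Hypothetical two-row frame with a string and an integer column,
# mirroring the mixed dtypes of the frame in the question.
df = pd.DataFrame({"Process": ["p1", "p2"], "Duration": [140, 75]})

# Assigning to the row yielded by iterrows() only changes that temporary Series.
for index, row in df.iterrows():
    row["Duration"] = 0
after_row_assignment = df["Duration"].tolist()

# Writing through .loc with the row label and column name updates the frame.
for index, row in df.iterrows():
    df.loc[index, "Duration"] = row["Duration"] - 70
after_loc_assignment = df["Duration"].tolist()

print(after_row_assignment)  # the frame is unchanged by the first loop
print(after_loc_assignment)  # the frame is updated by the second loop
```

The first loop leaves the frame untouched, which is why the while condition in the question never becomes false.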
while (d1['Duration'] > 0).any():
    for index, row in d1.iterrows():
        # if value in column 'Duration' > the timeslice, add the value of the timeslice to the current counter,
        # subtract it from the current value in column 'Duration'
        if row.Duration > timeslice:
            counter += timeslice
            #row.Duration -= timeslice
            # !!!LOOK HERE!!! write to the DataFrame itself, not to the row copy
            d1.loc[index, 'Duration'] -= timeslice
            print(index, d1.loc[index, 'Duration'])
        # if value in column "Duration" <= the timeslice, add the current value of the row:Duration to the counter
        # subtract the Duration from itself, to make it 0
        # set row:Completion to the current counter, which is the completion time for the process
        elif row.Duration <= timeslice and row.Duration != 0:
            counter += row.Duration
            #row.Duration -= row.Duration
            #row.Completion = counter
            # !!!LOOK HERE!!! write to the DataFrame itself, not to the row copy
            d1.loc[index, 'Duration'] = 0
            d1.loc[index, 'Completion'] = counter
            print(index, d1.loc[index, 'Duration'])
        # otherwise, if the value in Duration is already 0, print that index, with the "Done" indicator
        else:
            print(index, "Done")
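For anyone who wants to run this end to end, here is a self-contained sketch of the same simplified round robin on the sample data from the question. The function name `fill_completion_times` and the choice to track the remaining time in a separate dict (so the Duration column is preserved) are my own assumptions. Like the loop above, it ignores Start and treats every process as ready at time 0.

```python
import pandas as pd


def fill_completion_times(frame, timeslice):
    """Return a copy of frame with Completion filled in by simple round robin.

    Assumption: all processes are ready at time 0 and are served in row order.
    """
    result = frame.copy()
    remaining = result["Duration"].to_dict()  # row label -> time still needed
    counter = 0
    while any(left > 0 for left in remaining.values()):
        for index in result.index:
            left = remaining[index]
            if left == 0:
                continue  # finished already, or zero duration from the start
            run = min(left, timeslice)
            counter += run
            remaining[index] = left - run
            if remaining[index] == 0:
                result.loc[index, "Completion"] = counter
    return result


# Sample data from the question, already sorted by Start.
sample = pd.DataFrame(
    {
        "Process": ["p4", "p1", "p2", "p3", "p6", "p5"],
        "Start": [0, 5, 14, 36, 40, 67],
        "Duration": [280, 140, 75, 320, 0, 125],
        "Completion": 0,
    },
    index=[3, 0, 1, 2, 5, 4],
)
filled = fill_completion_times(sample, 70)
print(filled)
```

Note that p6, whose duration is 0, never receives a slice, so its Completion stays at 0 in this sketch.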
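By the way, I think you may be trying to simulate a process scheduling algorithm. In that case you have to take "Start" into account, because not every process starts at the same time.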
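As a rough illustration of what taking "Start" into account could look like, here is a minimal sketch using a ready queue. Everything in it is an assumption on my part rather than something from the question: the function name `round_robin_with_arrivals`, the `(name, start, duration)` tuple layout, and the tie-breaking rules (processes arriving during a slice queue ahead of the process that was just preempted, the clock jumps forward when nothing is ready, and a zero-duration process completes the moment it is dequeued).

```python
from collections import deque


def round_robin_with_arrivals(processes, timeslice):
    """Simulate round robin for (name, start, duration) tuples.

    Returns a dict mapping each process name to its completion time.
    """
    pending = deque(sorted(processes, key=lambda p: p[1]))  # not yet arrived
    ready = deque()  # [name, remaining] entries waiting for the CPU
    completion = {}
    clock = 0
    while pending or ready:
        if not ready:
            # Nothing to run: jump ahead to the next arrival.
            clock = max(clock, pending[0][1])
        # Admit every process that has arrived by now.
        while pending and pending[0][1] <= clock:
            name, _start, duration = pending.popleft()
            ready.append([name, duration])
        name, remaining = ready.popleft()
        run = min(timeslice, remaining)
        clock += run
        # Processes that arrived during this slice queue up first.
        while pending and pending[0][1] <= clock:
            new_name, _start, duration = pending.popleft()
            ready.append([new_name, duration])
        if remaining > run:
            ready.append([name, remaining - run])  # preempted, back of the queue
        else:
            completion[name] = clock
    return completion


# Small made-up example: "c" arrives long after "a" and "b" have finished.
example = round_robin_with_arrivals([("a", 0, 5), ("b", 1, 3), ("c", 20, 2)], 2)
print(example)
```

Other tie-breaking rules are just as defensible, so the completion times you get depend on which convention your course or textbook uses.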
(Your ideal table is slightly off.)
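Once Completion is filled in, the average turnaround time you mention is a short calculation. Here is a sketch that assumes turnaround means Completion minus Start, uses the Completion values from the target table in the question, and leaves out p6 because its duration is 0. The `table` frame and the `Turnaround` column name are my own.

```python
import pandas as pd

# Start and Completion values taken from the question's target table (p6 omitted).
table = pd.DataFrame(
    {
        "Process": ["p4", "p1", "p2", "p3", "p5"],
        "Start": [0, 5, 14, 36, 67],
        "Completion": [830, 490, 495, 940, 620],
    }
)

# Assumed definition: turnaround = completion time - start (arrival) time.
table["Turnaround"] = table["Completion"] - table["Start"]
average_turnaround = table["Turnaround"].mean()

print(table)
print("Average turnaround:", average_turnaround)
```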