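I started with more than 12,000 lines of data spread across 48 files. After creating 4 modules and 16 Python classes, the program has so far reduced that data to 3 .csv files with 16 rows each (48 rows in total). I have to create 3 scatter plots.
I have already plotted the 16 rows of one file, split into 4 data sets. I need to add a trend line for each data set on the scatter plot: 4 trend lines in total, each fitted through 4 data points. The plot in question is one of the 3 scatter plots, and it needs four trend lines, one each for the first, second, third and fourth data sets. Below is an example of the class that produces it: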
import matplotlib.pyplot as plt
import numpy as np

# Created as part of the DineroIV Simulation reporting module
# author:GeoWade
class SplitFormatDc1():

    def openDc1Csv(self):
        lstdc1csv = []
        # open the file and read it into the program
        with open('Dc1din-Dcaches-totalsXY.csv', 'r') as filadc1csv:
            for vcs1dc in filadc1csv:
                vcs1dcOne = vcs1dc.strip('\n')
                # keep only the third column (the miss rate)
                vcs1dcTwo = vcs1dcOne.split(',')[-1]
                lstdc1csv.append(vcs1dcTwo)
        return lstdc1csv

    def split_Dc1Csv(self):
        # read the miss rates using the method above
        lstdc1split = self.openDc1Csv()
        # convert every data point into a float
        four8k = [float(x) for x in lstdc1split]
        # declare 4 lists for the soon to be data sets
        fourk = []
        eightk = []
        sixtnk = []
        thtwok = []
        lineLst = [32.00, 64.00, 128.00, 256.00]
        numb = 0
        # create 4 separate data sets, 1 for each cache size,
        # holding the hit rate (1 - miss rate) per line size
        for t in four8k:
            if (numb < 4):
                tOne = 1.00 - t
                fourk.append(tOne)
            elif (numb >= 4 and numb < 8):
                tTwo = 1.00 - t
                eightk.append(tTwo)
            elif (numb >= 8 and numb < 12):
                tThree = 1.00 - t
                sixtnk.append(tThree)
            else:
                tFour = 1.00 - t
                thtwok.append(tFour)
            numb += 1
        fig = plt.figure()
        ax1 = fig.add_subplot(111)
        # plot the scatter for the 4kb cache
        ax1.scatter(lineLst, fourk, s=10, c='b', marker="s", label='4Kb Cache')
        # fit a first-degree polynomial (straight line)
        z1 = np.polyfit(lineLst, fourk, 1)
        # wrap the coefficients in a callable polynomial
        p1 = np.poly1d(z1)
        # plot the line fit
        ax1.plot(lineLst, p1(lineLst), "b--")
        # plot the scatter for the 8kb cache
        ax1.scatter(lineLst, eightk, s=10, c='r', marker="o", label='8Kb Cache')
        # create the polyfit
        z2 = np.polyfit(lineLst, eightk, 1)
        # create the poly1d
        p2 = np.poly1d(z2)
        # plot the line fit
        ax1.plot(lineLst, p2(lineLst), "r--")
        # plot the scatter for the 16kb cache
        ax1.scatter(lineLst, sixtnk, s=10, c='g', marker="s", label='16Kb Cache')
        # create the polyfit
        z3 = np.polyfit(lineLst, sixtnk, 1)
        # create the poly1d
        p3 = np.poly1d(z3)
        # plot the line fit
        ax1.plot(lineLst, p3(lineLst), "g--")
        # plot the scatter for the 32kb cache
        ax1.scatter(lineLst, thtwok, s=10, c='y', marker="o", label='32Kb Cache')
        # create the polyfit
        z4 = np.polyfit(lineLst, thtwok, 1)
        # create the poly1d
        p4 = np.poly1d(z4)
        # plot the line fit
        ax1.plot(lineLst, p4(lineLst), "y--")
        # add a legend
        ax1.legend(loc='lower left')
        # save the image
        fig.savefig('cc1din1.png', dpi=75)
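The scatter, polyfit and plot steps are the same for every data set, so they can also be written once and run in a loop. The following is only a sketch of that idea, not part of the original class: the function name `plot_with_trendlines`, the `HIT_RATES` and `STYLES` tables and the `Agg` backend choice are assumptions made for illustration, and the hit rates are 1 minus the miss rates from the CSV data in this post.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Line sizes and hit rates (1 - miss rate), taken from the CSV data in this post.
LINE_SIZES = [32.0, 64.0, 128.0, 256.0]
HIT_RATES = {
    "4Kb Cache": [0.9260, 0.9184, 0.8922, 0.8609],
    "8Kb Cache": [0.9546, 0.9504, 0.9385, 0.9205],
    "16Kb Cache": [0.9748, 0.9751, 0.9724, 0.9631],
    "32Kb Cache": [0.9862, 0.9885, 0.9882, 0.9846],
}
# Hypothetical style table: (color, marker) for each data set.
STYLES = {
    "4Kb Cache": ("b", "s"),
    "8Kb Cache": ("r", "o"),
    "16Kb Cache": ("g", "s"),
    "32Kb Cache": ("y", "o"),
}


def plot_with_trendlines(ax, x, datasets, styles):
    """Scatter each data set and overlay its least-squares straight line."""
    fits = {}
    for label, y in datasets.items():
        color, marker = styles[label]
        ax.scatter(x, y, s=10, color=color, marker=marker, label=label)
        # Degree-1 polyfit returns the coefficients [slope, intercept].
        slope, intercept = np.polyfit(x, y, 1)
        ax.plot(x, np.polyval([slope, intercept], x),
                color=color, linestyle="--")
        fits[label] = (slope, intercept)
    ax.legend(loc="lower left")
    return fits


fig, ax = plt.subplots()
fits = plot_with_trendlines(ax, LINE_SIZES, HIT_RATES, STYLES)
```

Calling `fig.savefig('cc1din1.png', dpi=75)` afterwards would write the image as in the class above. Returning the slope and intercept per data set also makes it easy to print or compare the fitted lines.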
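This is the data that the first function reads from Dc1din-Dcaches-totalsXY.csv. The first column is no longer used: I designed the class so that the 16 rows are split into 4 lists by position while iterating, and the first column only identifies which list the value in the third column belongs to.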
4k,32,0.0740
4k,64,0.0816
4k,128,0.1078
4k,256,0.1391
8k,32,0.0454
8k,64,0.0496
8k,128,0.0615
8k,256,0.0795
16k,32,0.0252
16k,64,0.0249
16k,128,0.0276
16k,256,0.0369
32k,32,0.0138
32k,64,0.0115
32k,128,0.0118
32k,256,0.0154
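Because the first column already names the data set each row belongs to, the rows could also be grouped by that column instead of by counting positions. This is a minimal sketch of that alternative, assuming the three-column layout shown above; `group_hit_rates` and `SAMPLE` are hypothetical names, and the data is inlined so the example does not depend on the file being present.

```python
import csv
import io
from collections import defaultdict

# The 16 rows shown above, inlined so the sketch is self-contained.
SAMPLE = """\
4k,32,0.0740
4k,64,0.0816
4k,128,0.1078
4k,256,0.1391
8k,32,0.0454
8k,64,0.0496
8k,128,0.0615
8k,256,0.0795
16k,32,0.0252
16k,64,0.0249
16k,128,0.0276
16k,256,0.0369
32k,32,0.0138
32k,64,0.0115
32k,128,0.0118
32k,256,0.0154
"""


def group_hit_rates(lines):
    """Return {cache_size: [(line_size, hit_rate), ...]} keyed by column 1."""
    groups = defaultdict(list)
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        cache_size, line_size, miss_rate = row
        # Hit rate is 1 - miss rate, as in split_Dc1Csv.
        groups[cache_size].append((float(line_size), 1.0 - float(miss_rate)))
    return dict(groups)


groups = group_hit_rates(io.StringIO(SAMPLE))
for cache_size, points in groups.items():
    print(cache_size, points)
```

With the real file, `group_hit_rates(open('Dc1din-Dcaches-totalsXY.csv', newline=''))` would take the place of the `io.StringIO` call. Grouping by key keeps working if the rows are reordered or if a cache size has more or fewer than four line sizes.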