I am trying to make an XYZ plot of my data points. Each data point has an associated value: "1" for error or "0" for success. My data is in this link.
As a first attempt, I used splot:
splot "data_all.dat" u 1:2:3:4 w points ls 1 palette title "P_{error}"
The problem with this plot is that it is hard to make out the relative positions of the points and where they lie in space. To address this, a previous question, How to plot (x,y,z) points showing their density, provides a solution that shades the color of each point according to the local point density.
I would like to extend that question by including the category of each point (error or success) in the coloring criterion. That is, both types of points should be taken into account for the coloring, and the points of all categories should be plotted.
I have no precise coloring method in mind, but one idea would be to use a function like
(1 - a) x (num_success_in_delta) + (a x num_errors_in_delta)
where a is a real number in [0,1] that weights the number of error points against the number of success points within a ball delta. Interpolating the XYZ points between error and success samples may be another way, but I do not know how to approach it in Gnuplot.
To convey the point density better, isosurface projections or a 2D density map on the XY plane might help, if possible.
I would like to produce a file that can be included as a figure in LaTeX, with Gnuplot or pgfplots quality.
Thanks.
Answer 0: (score: 2)
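The following code is a slight modification of the solution here (How to plot (x,y,z) points showing their density). The occurrences of errors and successes are counted within a volume of (2*DeltaX x 2*DeltaY x 2*DeltaZ) around each data point.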
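For reference, the two counts can be written out explicitly. This is only my reading of the code below; the symbols N_E, N_S and r_j are introduced here for illustration and do not appear in the original solution. For each data point i, with r_j = 1 for an error and r_j = 0 for a success:

```latex
\begin{align*}
N_E(i) &= \#\bigl\{\, j : |x_j - x_i| < \Delta_X,\ |y_j - y_i| < \Delta_Y,\ |z_j - z_i| < \Delta_Z,\ r_j = 1 \,\bigr\}, \\
N_S(i) &= \#\bigl\{\, j : |x_j - x_i| < \Delta_X,\ |y_j - y_i| < \Delta_Y,\ |z_j - z_i| < \Delta_Z,\ r_j = 0 \,\bigr\}.
\end{align*}
```

Point i always satisfies the distance conditions for itself, so it is counted in its own box and N_E(i) + N_S(i) is at least 1. In the script, N_E ends up in ColCE (column 5 of the result file) and N_S in ColCS (column 6).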
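The results are stored in a file, so you only need to do the counting once (your 10,000 lines of data took about 1h15min on my old PC). Maybe the efficiency of the gnuplot code can be improved. Afterwards, you take the second code below and quickly plot the result file. I am not sure which coloring is best; you can play with the palette. As an example, the code below uses red (-1) for the maximum error count (i.e. density) and green (+1) for the maximum success density. I hope this can serve as a starting point for further optimization.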
### 3D density plot
reset session
FILE = "data_all.dat"
DeltaX = 0.5 # half boxwidth
DeltaY = 0.5 # half boxlength
DeltaZ = 0.5 # half boxheight
TimeStart = time(0.0)
# put the datafile/dataset into arrays
stats FILE nooutput
RowCount = STATS_records
array ColX[RowCount]
array ColY[RowCount]
array ColZ[RowCount]
array ColR[RowCount] # Result 0=Success, 1=Error
array ColCE[RowCount] # Counts Error
array ColCS[RowCount] # Counts Success
# a single pass over the file fills all four arrays ($0 is the zero-based row index),
# so no surrounding loop is needed
set table $Dummy
    plot FILE u (ColX[$0+1]=$1,0):(ColY[$0+1]=$2,0):(ColZ[$0+1]=$3,0):(ColR[$0+1]=$4,0) with table
unset table
# look at each datapoint and its surrounding
Error = 1
Success = 0
do for [i=1:RowCount] {
    print sprintf("Datapoint %g of %g",i,RowCount)
    x0 = ColX[i]
    y0 = ColY[i]
    z0 = ColZ[i]
    # count the datapoints with distances <Delta around the datapoint of interest
    set table $ErrorOccurrences
        plot FILE u ((abs(x0-$1)<DeltaX) & (abs(y0-$2)<DeltaY) & (abs(z0-$3)<DeltaZ) & ($4==Error) ? 1 : 0):(1) smooth frequency
    unset table
    set table $SuccessOccurrences
        plot FILE u ((abs(x0-$1)<DeltaX) & (abs(y0-$2)<DeltaY) & (abs(z0-$3)<DeltaZ) & ($4==Success) ? 1 : 0):(1) smooth frequency
    unset table
    # extract the count from the second row (the "1" bin) of each table; it will be used to color the datapoint
    c0 = 0    # reset, so a missing "1" row (no matches in the box) gives 0 instead of a stale value
    set table $ErrorDummy
        plot $ErrorOccurrences u (c0=$2,0):($0) every ::1::1 with table
    unset table
    ColCE[i] = c0
    c0 = 0    # reset again before reading the success count
    set table $SuccessDummy
        plot $SuccessOccurrences u (c0=$2,0):($0) every ::1::1 with table
    unset table
    ColCS[i] = c0
}
# put the arrays into a dataset again
set print $Data
do for [i=1:RowCount] {
    print sprintf("%g\t%g\t%g\t%g\t%g\t%g",ColX[i],ColY[i],ColZ[i],ColR[i],ColCE[i],ColCS[i])
}
set print
stats $Data u 5:6 nooutput
CEmax = STATS_max_x
CSmax = STATS_max_y
print CEmax, CSmax
TimeEnd = time(0.0)
print sprintf("Duration: %.3f sec",TimeEnd-TimeStart)
set print "data_all_color.dat"
print $Data
set print
set palette defined (-1 "red", 0 "white", 1 "green")
splot $Data u 1:2:3:($4==1? -$5/CEmax : $6/CSmax) w p ps 0.5 pt 7 lc palette z notitle
### end of code
Once the occurrences have been counted, simply plot the new data file and play with the palette.
### 3D density plot
reset session
FILE = "data_all_color.dat"
stats FILE u 5:6 nooutput # get maximum count from Error and Success
CEmax = STATS_max_x
CSmax = STATS_max_y
print CEmax, CSmax
set ztics 0.2
set view 50,70
set palette defined (-1 "red", 0 "white", 1 "green")
splot FILE u 1:2:3:($4==1? -$5/CEmax : $6/CSmax) w p ps 0.2 pt 7 lc palette z notitle
### end of code
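For example, with your data:
Addition:
Currently, columns $5 and $6 contain the absolute number of occurrences of Error and Success, respectively, within a given volume. If you want an error probability, my guess (though I am not a statistician) is that you have to divide the number of error occurrences $5 by the total number of events $5+$6 in that volume.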
splot FILE u 1:2:3:($5/($5+$6)) w p ps 0.2 pt 7 lc palette z notitle
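A small, hedged variation on the command above: fixing the color range to [0:1] keeps the colors comparable between plots, and a green-to-red palette matches the success/error colors used earlier. The helper name Perror, the palette and the color-box label are my own choices for illustration, not part of the original solution; the guard against an empty box is purely defensive.

```gnuplot
### error probability per box (sketch)
FILE = "data_all_color.dat"    # result file written by the first script
# assumption: column 5 = error count, column 6 = success count
Perror(e,s) = (e+s) > 0 ? e/(e+s) : NaN    # undefined (NaN) points are skipped
set cbrange [0:1]                          # probability always spans 0..1
set palette defined (0 "green", 1 "red")   # assumed palette: 0 = only successes, 1 = only errors
set cblabel "P_{error}"
splot FILE u 1:2:3:(Perror($5,$6)) w p ps 0.2 pt 7 lc palette z notitle
### end of code
```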
Another example palette would be
set palette rgb 33,13,10
In general, for palettes, consult help palette, where you will find a lot of details.
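Coming back to the weighted measure suggested in the question, (1 - a) x num_success + a x num_errors, the same result file can be reused without recounting. The following is only a sketch: the weight a = 0.7 is an arbitrary value chosen for illustration, and Score is a hypothetical helper name.

```gnuplot
### weighted error/success score (sketch)
FILE = "data_all_color.dat"    # result file written by the first script
a = 0.7                        # hypothetical weight in [0,1]; a=1 counts errors only, a=0 successes only
# assumption: column 5 = error count, column 6 = success count
Score(e,s) = (1-a)*s + a*e
set palette rgb 33,13,10
splot FILE u 1:2:3:(Score($5,$6)) w p ps 0.2 pt 7 lc palette z notitle
### end of code
```

Because the counts are absolute, the score grows with the local point density. Dividing it by $5+$6 would turn it into a density-independent mix, if that is what is wanted.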