How to make a 3D density plot according to the category of the points

Time: 2018-12-14 12:19:10

Tags: 3d gnuplot density-plot

I am trying to make an XYZ plot of my data points. Each data point has an associated value: "1" for an error or "0" for a success. My data is at this link. As a first attempt, I used splot:

splot "data_all.dat" u 1:2:3:4 w points ls 1 palette title "P_{error}"

[image]

The problem with this plot is that the relative positions of the points and their location in space cannot be distinguished well. To address this, an earlier question, How to plot (x,y,z) points showing their density, offers a solution that colors the points according to their density.

I would like to extend that question by including the category of each point (error or success) in the criterion used to color them. That is, both types of points should be taken into account for the coloring, and the points of all categories should be plotted.

I do not have an exact coloring method, but one idea might be to use a function like (1 - a) x (num_success_in_delta) + (a x num_errors_in_delta), where a is a real number in [0,1] that weights the numbers of error and success points within delta. Interpolating the XYZ points between error and success samples might be another way, but I do not know how to tackle that in Gnuplot.

To improve the density information of the points, a contour projection or a 2D density plot on the XY plane would be welcome, if possible. I would like to produce a file to use as a figure in LaTeX, with the quality that Gnuplot or pgfplots provides.

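Thanks

1 Answer:

Answer 0 (score: 2)

The following code is a slight modification of the solution given here (How to plot (x,y,z) points showing their density). For each data point, the occurrences of errors and successes are counted within a volume of (2*DeltaX x 2*DeltaY x 2*DeltaZ) around it. The results are stored in a file, so you only need to do the counting once (your 10,000-line data took about 1h15min on my old PC). The gnuplot code can probably be made more efficient. You then take the second code below and quickly plot the result file. I am not sure which coloring is best; you can play with the palette. As an example, the code below uses red (-1) for the maximum error count (i.e. density) and green (+1) for the maximum success density. I hope this can serve as a starting point for further optimization.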

### 3D density plot
reset session

FILE = "data_all.dat"

DeltaX = 0.5  # half boxwidth
DeltaY = 0.5  # half boxlength
DeltaZ = 0.5  # half boxheight

TimeStart = time(0.0)

# put the datafile/dataset into arrays
stats FILE nooutput
RowCount = STATS_records
array ColX[RowCount]
array ColY[RowCount]
array ColZ[RowCount]
array ColR[RowCount]   # Result 0=Success, 1=Error
array ColCE[RowCount]  # Counts Error
array ColCS[RowCount]  # Counts Success
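# a single pass over the file fills all arrays ($0 is the zero-based row index),
# so no loop around the plot command is needed
set table $Dummy
    plot FILE u (ColX[$0+1]=$1,0):(ColY[$0+1]=$2,0):(ColZ[$0+1]=$3,0):(ColR[$0+1]=$4,0) with table
unset table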

# look at each datapoint and its surroundings
Error = 1
Success = 0
do for [i=1:RowCount] {
    print sprintf("Datapoint %g of %g",i,RowCount)
    x0 = ColX[i]
    y0 = ColY[i]
    z0 = ColZ[i]
    # count the datapoints with distances <Delta around the datapoint of interest
    set table $ErrorOccurrences
        plot FILE u ((abs(x0-$1)<DeltaX) & (abs(y0-$2)<DeltaY) & (abs(z0-$3)<DeltaZ) & ($4==Error)? 1 : 0):(1) smooth frequency
    unset table
    set table $SuccessOccurrences
        plot FILE u ((abs(x0-$1)<DeltaX) & (abs(y0-$2)<DeltaY) & (abs(z0-$3)<DeltaZ) & ($4==Success) ? 1 : 0):(1) smooth frequency
    unset table
    # extract the count from the second row (x=1) of each table; it is used to color the datapoint
    c0 = 0   # reset: if no point matches, the table has no second row and c0 stays 0
    set table $ErrorDummy
        plot $ErrorOccurrences u (c0=$2,0):($0) every ::1::1 with table
    unset table
    ColCE[i] = c0
    c0 = 0   # reset again before reading the success count
    set table $SuccessDummy
        plot $SuccessOccurrences u (c0=$2,0):($0) every ::1::1 with table
    unset table
    ColCS[i] = c0
}

# put the arrays into a dataset again
set print $Data
do for [i=1:RowCount] {
    print sprintf("%g\t%g\t%g\t%g\t%g\t%g",ColX[i],ColY[i],ColZ[i],ColR[i],ColCE[i],ColCS[i])
}
set print

stats $Data u 5:6 nooutput
CEmax = STATS_max_x
CSmax = STATS_max_y
print CEmax, CSmax

TimeEnd = time(0.0)
print sprintf("Duration: %.3f sec",TimeEnd-TimeStart)

set print "data_all_color.dat"
    print $Data
set print

set palette defined (-1 "red", 0 "white", 1 "green")
splot $Data u 1:2:3:($4==1? -$5/CEmax : $6/CSmax) w p ps 0.5 pt 7 lc palette z notitle
### end of code

Once the occurrences have been counted, simply plot the new data file and play with the palette.

### 3D density plot
reset session

FILE = "data_all_color.dat"

stats FILE u 5:6 nooutput  # get maximum count from Error and Success
CEmax = STATS_max_x
CSmax = STATS_max_y
print CEmax, CSmax

set ztics 0.2
set view 50,70
set palette defined (-1 "red", 0 "white", 1 "green")

splot FILE u 1:2:3:($4==1? -$5/CEmax : $6/CSmax) w p ps 0.2 pt 7 lc palette z notitle
### end of code

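For example, with your data:

[image]

Addition: Currently, columns $5 and $6 contain the absolute numbers of occurrences of Error and Success, respectively, within the given volume. If you want an error probability (I am not a statistician), my guess is that you have to divide the number of error occurrences $5 by the total number of events $5+$6 in that volume.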

splot FILE u 1:2:3:($5/($5+$6)) w p ps 0.2 pt 7 lc palette z notitle
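
The weighting idea from the question, (1 - a) x (num_success_in_delta) + (a x num_errors_in_delta), can be plotted from the same file in a similar way. This is only a minimal sketch: it assumes that columns 5 and 6 of data_all_color.dat hold the error and success counts as written by the first script, and the value a = 0.5 is an arbitrary choice for illustration.

```gnuplot
### weighted coloring (sketch)
reset session

FILE = "data_all_color.dat"

a = 0.5   # assumed weight in [0,1]: 0 = only successes count, 1 = only errors count

set view 50,70
# color value = (1-a)*SuccessCount + a*ErrorCount within the box around each point
splot FILE u 1:2:3:((1-a)*$6 + a*$5) w p ps 0.2 pt 7 lc palette z notitle
### end of code
```

Unlike the probability above, this value is not normalized, so the color range depends on the largest counts in the data.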

Another example palette is set palette rgb 33,13,10. In general, for palettes, consult help palette, where you will find many details.
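
The question also asks for a figure that can be used in LaTeX. The desired file format is not clear from the question, so the sketch below assumes PDF and a gnuplot build that includes the pdfcairo terminal; the output file name and the figure size are hypothetical.

```gnuplot
### write the plot to a PDF file (sketch)
reset session

FILE = "data_all_color.dat"

stats FILE u 5:6 nooutput
CEmax = STATS_max_x
CSmax = STATS_max_y

set terminal pdfcairo size 12cm,9cm   # assumed figure size
set output "density3d.pdf"            # hypothetical output file name

set view 50,70
set palette defined (-1 "red", 0 "white", 1 "green")
splot FILE u 1:2:3:($4==1? -$5/CEmax : $6/CSmax) w p ps 0.2 pt 7 lc palette z notitle

set output   # close the file so that it is written completely
### end of code
```

The resulting file could then be included in LaTeX with \includegraphics from the graphicx package.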