I have a DataFrame with a column that I bin as follows:
import pandas as pd

def binning(col, cut_points, labels=None):
    '''
    From https://www.analyticsvidhya.com/blog/2016/01/12-pandas-techniques-python-data-manipulation/
    '''
    # Define min and max values:
    minval = col.min()
    maxval = col.max()
    # Create list by adding min and max to cut_points
    break_points = [minval] + cut_points + [maxval]
    # If no labels provided, use default labels 0 ... (n-1)
    if not labels:
        labels = range(len(cut_points) + 1)
    # Binning using cut function of pandas
    colBin = pd.cut(col, bins=break_points, labels=labels, include_lowest=True, duplicates='drop')
    return colBin

cut_points = [0.5, 3.5, 4.5]
# pd.cut bins are closed on the right by default, so each label uses "<=" on its upper edge
labels = ["z<=0.5", "0.5<z<=3.5", "3.5<z<=4.5", "z>4.5"]
sources["z_bin"] = binning(sources["z"], cut_points, labels)
print(sources["z_bin"].value_counts(sort=False))
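I want to pass each bin to a function I wrote that draws a scatter plot. I know pandas has plotting functionality that wraps matplotlib, but I would like to use my custom function where possible so that the formatting stays consistent with my other figures. My custom function looks like this: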
plotSelected(x, y, name_for_y_series, ...a couple of other arguments)
Is there a way to plot the y series against the binned x values? Something like:
plotSelected(x_binned, y, name_for_y_series, ...a couple of other arguments)
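To make the intent concrete, here is a minimal sketch of one way this might work: group the frame by the binned column and call the plotting function once per group. The sample data, the `flux` column, and `plot_selected_stub` are all hypothetical stand-ins. The real `plotSelected` takes further arguments that are not shown here, so the stub only records what it receives.

```python
import pandas as pd

# Hypothetical stand-in data; the real `sources` frame is not shown.
sources = pd.DataFrame({
    "z": [0.1, 0.4, 1.0, 2.5, 3.6, 4.0, 4.8, 5.0],
    "flux": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

# Same bin edges as in the question: min, the cut points, max.
break_points = [sources["z"].min(), 0.5, 3.5, 4.5, sources["z"].max()]
labels = ["z<=0.5", "0.5<z<=3.5", "3.5<z<=4.5", "z>4.5"]
sources["z_bin"] = pd.cut(
    sources["z"], bins=break_points, labels=labels, include_lowest=True
)

def plot_selected_stub(x, y, name_for_y_series):
    """Hypothetical placeholder for plotSelected; returns what it was given."""
    return (name_for_y_series, len(x), len(y))

# groupby yields (bin label, sub-frame) pairs in category order;
# observed=True skips bins that contain no rows.
calls = []
for bin_label, group in sources.groupby("z_bin", observed=True):
    calls.append(plot_selected_stub(group["z"], group["flux"], f"flux ({bin_label})"))

print(calls)
```

Each `group` is an ordinary DataFrame, so `group["z"]` and `group["flux"]` are plain Series that a function expecting x and y values should accept unchanged.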
I don't know how pandas organizes the bins. Are they lists, tuples, or something else?
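As far as I can tell from inspecting the result, `pd.cut` returns neither lists nor tuples. It returns a Series of categorical dtype with one bin label per row. The sketch below uses made-up values and short placeholder labels (`"a"` to `"d"`) in place of `sources["z"]` and the real labels, so treat it as an illustration rather than a description of the actual data.

```python
import pandas as pd

# Hypothetical values standing in for sources["z"].
z = pd.Series([0.1, 0.4, 1.0, 2.5, 3.6, 4.0, 4.8, 5.0], name="z")
edges = [z.min(), 0.5, 3.5, 4.5, z.max()]

# With labels, the result is a Series of ordered categorical dtype.
labelled = pd.cut(z, bins=edges, labels=["a", "b", "c", "d"], include_lowest=True)
# Without labels, each category is a pandas Interval built from the bin edges.
unlabelled = pd.cut(z, bins=edges, include_lowest=True)

print(labelled.dtype)                 # the categorical dtype
print(list(labelled.cat.categories))  # the labels, in bin order
print(labelled.cat.codes.tolist())    # integer bin index for each row
print(unlabelled.cat.categories)      # an IntervalIndex of the bins

# A boolean mask on the binned column selects the rows of a single bin.
in_second_bin = z[labelled == "b"]
print(in_second_bin.tolist())
```

Because the binned column is aligned row by row with the original frame, it can serve as a mask or a groupby key without converting the bins to another container.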