Question

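I have a grouped pandas DataFrame, grouped on three levels (Date, City, Neighborhood), followed by a "gap" column.

The gap column holds the values I am trying to unstack.

The bins in the gap column include: 0, 0.5 to 3, 3.5 to 5, 5.5 to 7, and so on.

I want to unstack the data so that, for each neighborhood, I can see the count of each gap.

Is it possible to unstack the gap values while keeping the Neighborhood, City, and Date groups?

The end goal is a bar chart for each city at each point in time, where each bar shows the stacked gaps for one neighborhood.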

When I try to call unstack('gap'), I get a KeyError that says "Level gap not found".
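A possible reading of that error, sketched on a toy frame rather than on my real data: `unstack` looks up index levels, not columns, so passing a column name is reported as a missing level. The names `toy`, `a`, `b`, and `val` below are made up for illustration only.

```python
import pandas as pd

# Toy frame: "a" and "b" are index levels, "val" is an ordinary column.
toy = pd.DataFrame(
    {"a": ["x", "x", "y"], "b": ["p", "q", "p"], "val": [1, 2, 3]}
).set_index(["a", "b"])

# Asking unstack for a column name fails, because it only searches index levels.
try:
    toy.unstack("val")
    message = None
except KeyError as err:
    message = str(err)
print(message)

# Unstacking a real index level works and moves "b" into the columns.
wide = toy.unstack("b")
print(wide)
```

If that is what is happening here, the binned column would have to become an index level before it can be unstacked.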

Here is the code that got me to this point:

import pandas as pd

minG = tFrame.groupby(['Date','City','Neighborhood','ID']) # there are multiple gap values for each ID

grouped_gap = minG['GAP'] # the series of gaps for each ID

groupedMin = grouped_gap.agg([('Minimum', 'min')]) # I need the minimum gap value for each ID

groupedMin = groupedMin.replace(-1, 0) # the datasource had -1 gap values

label = ['0', '0.5 to 3', '3 to 5', '5 to 7', '7 to 9', '9 to 12', '12+'] # sets the label for each desired bin

groupedMin['gaps'] = pd.cut(groupedMin['Minimum'], bins = [-1, 0.5, 3, 5, 7, 9, 12, 48], labels = label) # places each ID in a bucket, based on the labels
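To make the bucketing step concrete, here is a small sketch of how `pd.cut` assigns labels with these edges. The values in `sample` are hypothetical, picked to sit on and around the bin boundaries. By default `pd.cut` uses right-closed intervals, so a value exactly on an edge goes into the lower bin, and anything above the last edge (48) would come back as NaN.

```python
import pandas as pd

labels = ['0', '0.5 to 3', '3 to 5', '5 to 7', '7 to 9', '9 to 12', '12+']
edges = [-1, 0.5, 3, 5, 7, 9, 12, 48]

# Hypothetical minimum-gap values on and around the bin edges.
sample = pd.Series([0, 0.5, 1, 3, 3.5, 5, 22])

# Intervals are (-1, 0.5], (0.5, 3], (3, 5], ... so 0.5 is labelled '0'
# and 5 is labelled '3 to 5'.
binned = pd.cut(sample, bins=edges, labels=labels)
result = list(binned.astype(str))
print(result)
```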

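Any help getting from here to those bar charts is appreciated.

Edit:

This is what I am seeing:

The first image shows the Minimum column, which is the minimum gap value for each vehicle: http://i58.tinypic.com/2co2zja.jpg
The second image shows the new gaps column, where each minimum value has been binned: http://i60.tinypic.com/1j47bm.jpg

Edit #2, with sample code and a DataFrame: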

import pandas as pd  # needed for pd.cut below
from pandas import Series, DataFrame

bFrame = DataFrame([["672,059,124","Central Business District","Baltimore","6/1/2013 13:00",4],
                   ["672,059,144","Central Business District","Baltimore","6/1/2013 13:00",1], 
                   ["673,928,993","Goucher/Towson (Baltimore County)","Baltimore","6/1/2013 13:00",-1],
                   ["647,380,667","Goucher/Towson (Baltimore County)","Baltimore","6/1/2013 13:00",4], 
                   ["801,833,082","Brookline","Boston","6/1/2013 13:00",22], 
                   ["801,833,082","Brookline","Boston","6/1/2013 13:00",24],
                   ["821,833,082","Brookline","Boston","6/1/2013 13:00",5],
                   ["956,264,933","Financial District","Boston","6/1/2013 13:00",-1],
                   ["956,264,933","Financial District","Boston","6/1/2013 13:00",2]],
                   columns=["ID","Neighborhood","City","Date","GAP"])
minGap = bFrame.groupby(['Date','City','Neighborhood','ID']) # there are multiple gap values for each ID

grouped_g = minGap['GAP'] # the series of gaps for each ID

groupedMini = grouped_g.agg([('Minimum', 'min')]) # I need the minimum gap value for each ID

groupedMini = groupedMini.replace(-1, 0) # the datasource had -1 gap values

lab = ['0', '0.5 to 3', '3 to 5', '5 to 7', '7 to 9', '9 to 12', '12+'] # sets the label for each desired bin

groupedMini['gaps'] = pd.cut(groupedMini['Minimum'], bins = [-1, 0.5, 3, 5, 7, 9, 12, 48], labels = lab) # places each ID in a bucket, based on the labels
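For reference, this is a minimal sketch of the shape I am after, built from the sample rows above. It assumes that counting IDs per bin is the right aggregation; the names `rows`, `frame`, `minimum`, `edges`, `binned`, and `counts` are introduced only for this sketch. Grouping on the binned column turns it into an index level, which is what `unstack` needs.

```python
import pandas as pd

# Sample rows copied from the question: ID, Neighborhood, City, Date, GAP.
rows = [
    ["672,059,124", "Central Business District", "Baltimore", "6/1/2013 13:00", 4],
    ["672,059,144", "Central Business District", "Baltimore", "6/1/2013 13:00", 1],
    ["673,928,993", "Goucher/Towson (Baltimore County)", "Baltimore", "6/1/2013 13:00", -1],
    ["647,380,667", "Goucher/Towson (Baltimore County)", "Baltimore", "6/1/2013 13:00", 4],
    ["801,833,082", "Brookline", "Boston", "6/1/2013 13:00", 22],
    ["801,833,082", "Brookline", "Boston", "6/1/2013 13:00", 24],
    ["821,833,082", "Brookline", "Boston", "6/1/2013 13:00", 5],
    ["956,264,933", "Financial District", "Boston", "6/1/2013 13:00", -1],
    ["956,264,933", "Financial District", "Boston", "6/1/2013 13:00", 2],
]
frame = pd.DataFrame(rows, columns=["ID", "Neighborhood", "City", "Date", "GAP"])

# Minimum gap per ID, with the -1 values treated as 0, as in the code above.
minimum = (
    frame.groupby(["Date", "City", "Neighborhood", "ID"])["GAP"]
    .min()
    .replace(-1, 0)
)

labels = ["0", "0.5 to 3", "3 to 5", "5 to 7", "7 to 9", "9 to 12", "12+"]
edges = [-1, 0.5, 3, 5, 7, 9, 12, 48]

# Bin each ID, then keep the bin as a plain string column named "gaps".
binned = (
    pd.cut(minimum, bins=edges, labels=labels)
    .astype(str)
    .rename("gaps")
    .reset_index()
)

# Count IDs per (Date, City, Neighborhood, gaps). "gaps" is now an index level,
# so unstack can find it. Reindexing restores bin order and adds empty bins.
counts = (
    binned.groupby(["Date", "City", "Neighborhood", "gaps"])
    .size()
    .unstack("gaps", fill_value=0)
    .reindex(columns=labels, fill_value=0)
)
print(counts)
```

With a table like `counts`, something along the lines of `counts.loc[(date, city)].plot(kind='bar', stacked=True)` should draw one stacked bar per neighborhood for a given date and city, assuming matplotlib is installed.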

Unstacking a grouped DataFrame

0 answers: