pandas - Binning within date groupby

Time: 2019-01-10 22:29:47

Tags: python pandas bin

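My goal is to summarize data by size within each year of my dataset. I can do each task separately (summarize by year, or summarize by bin), but I run into syntax trouble when I try to combine the two.

Here is how I summarize the data by year: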

size_summary = df_raw.groupby(['Year'])['Quantity'].describe()

Here is how I create the bins:

mult = 1
bins = [5*mult, 10*mult, 25*mult, 50*mult, 100*mult]
groups = df_raw.groupby(pd.cut(df_raw['Quantity'], bins))

When I try to combine the two as shown below, I get an error message. Does anyone know how to combine them to reach my goal? Thanks for your help.

groups.groupby(['Year'])['Quantity'].describe()
AttributeError: Cannot access callable attribute 'groupby' of 'DataFrameGroupBy' objects, try using the 'apply' method

Edit: adding sample data as requested below.

df_raw = pd.DataFrame(data={
    'Year': [2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014],
    'Quantity': [2.0, 3.0, 78.8, 65.7, 70.0, 61.9, 83.9, 39.7, 44.1, 14.5, 35.3, 82.2, 13.9, 66.6, 65.8, 94.8, 50.8, 17.1, 9.9, 51.1, 62.9, 63.0, 13.5, 37.6, 1.5, 70.7, 23.3, 28.1, 21.9, 60.7, 1.1, 67.2, 0.4, 81.4, 86.7, 36.2, 45.2, 50.4, 43.3]
})

The desired output is in the format shown in the screenshot (apologies for using a screenshot). [image]

2 answers:

Answer 0 (score: 1)

You are really close. Try the following:

mult = 1
bins = [0, 5*mult, 10*mult, 25*mult, 50*mult, 100*mult]
df_raw['bin'] = pd.cut(df_raw['Quantity'], bins)
df_raw.pivot_table(index = 'bin', columns = 'Year', aggfunc = 'count')
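
A small variation on the call above, offered as a sketch rather than as the answerer's exact method: passing `values='Quantity'` keeps the columns as plain years instead of a two-level header, and `fill_value=0` shows empty cells as zero counts. The seven-row `df` below is an illustrative sample, not the full data set from the question, and `observed=False` is set explicitly only to keep every bin as a row.

```python
import pandas as pd

# Small illustrative sample (an assumption, not the full data from the question).
df = pd.DataFrame({
    "Year": [2012, 2012, 2012, 2013, 2013, 2014, 2014],
    "Quantity": [2.0, 14.5, 78.8, 9.9, 65.8, 21.9, 36.2],
})

# Same bin edges as in the answer (mult = 1).
bins = [0, 5, 10, 25, 50, 100]
df["bin"] = pd.cut(df["Quantity"], bins)

# Count rows per (bin, Year); naming the value column avoids a
# two-level column header, and fill_value=0 replaces missing cells.
table = df.pivot_table(
    index="bin",
    columns="Year",
    values="Quantity",
    aggfunc="count",
    fill_value=0,
    observed=False,
)
print(table)
```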

Answer 1 (score: 1)

As an alternative to pivot_table, you can group by bin and year, then reshape the data with unstack:

# first group by bins, then by year
groups = df_raw.groupby([pd.cut(df_raw['Quantity'], bins), 'Year'])

# compute group size, pivot into the shape you want
counts = groups.size().unstack(fill_value=0)
counts

Year       2012  2013  2014
Quantity
(5, 10]       0     1     0
(10, 25]      2     3     1
(25, 50]      3     2     3
(50, 100]     7     7     5

This is 2.5 times faster than pivot_table on the sample data you provided.
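
Since the question started from describe(), the same two-key grouping can probably feed describe() directly to get a per-year, per-bin summary instead of plain counts. This is a minimal sketch on an eleven-row sample taken from the question's data; the names `df`, `quantity_bin` and `summary` are illustrative, and `observed=True` is an assumption that drops year/bin combinations with no rows.

```python
import pandas as pd

# Illustrative subset of the question's sample data.
df = pd.DataFrame({
    "Year": [2012] * 4 + [2013] * 4 + [2014] * 3,
    "Quantity": [78.8, 65.7, 14.5, 39.7,
                 65.8, 94.8, 17.1, 9.9,
                 21.9, 60.7, 36.2],
})

# Bin edges from the question (mult = 1).
bins = [5, 10, 25, 50, 100]
quantity_bin = pd.cut(df["Quantity"], bins)

# Group by year first, then by bin, and summarize Quantity in each group.
# observed=True keeps only the (Year, bin) pairs that actually occur.
summary = (
    df.groupby(["Year", quantity_bin], observed=True)["Quantity"]
      .describe()
)
print(summary)
```

Groups holding a single row report NaN for std, which is expected for a sample standard deviation.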


To split the categorical interval index into a MultiIndex, use something like

def interval_to_tuple(interval):
    return interval.left, interval.right

counts.set_index(
    counts.index.astype(object).map(interval_to_tuple).rename(['Lower', 'Upper']))

You should be able to export this result to Excel without trouble.
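
For reference, here is an end-to-end sketch of the same idea on a small subset of the question's data. It builds the Lower/Upper levels with `pd.MultiIndex.from_arrays`, which is a swapped-in alternative to the `map`/`rename` chain above, and then flattens the index into ordinary columns. The names `df`, `lower`, `upper` and `flat` are illustrative.

```python
import pandas as pd

# Illustrative subset of the question's sample data.
df = pd.DataFrame({
    "Year": [2012] * 4 + [2013] * 4 + [2014] * 3,
    "Quantity": [78.8, 65.7, 14.5, 39.7,
                 65.8, 94.8, 17.1, 9.9,
                 21.9, 60.7, 36.2],
})

bins = [5, 10, 25, 50, 100]

# Count rows per (bin, Year) and pivot the years into columns.
counts = (
    df.groupby([pd.cut(df["Quantity"], bins), "Year"], observed=False)
      .size()
      .unstack(fill_value=0)
)

# Each index entry is a pandas Interval; pull out its two edges.
lower = [interval.left for interval in counts.index]
upper = [interval.right for interval in counts.index]
counts.index = pd.MultiIndex.from_arrays([lower, upper], names=["Lower", "Upper"])

# Turn the two index levels into regular columns for a flat table.
flat = counts.reset_index()
print(flat)
```

From here, `flat.to_excel(...)` with a file name of your choice should write the table, assuming an Excel writer engine such as openpyxl is installed.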