I did a groupby with pandas. Now I want to iterate over each row. Where did size go?
import pandas

df = pandas.DataFrame.from_dict(
{'category': {0: 'Apps', 1: 'Apps', 2: 'Apps', 3: 'Apps', 4: 'Apps', 5: 'Apps', 6: 'Apps', 7: 'Apps', 8: 'Apps', 9: 'Apps', 10: 'Apps', 11: 'Apps', 12: 'Apps', 13: 'Apps', 14: 'Apps'}, 'country': {0: 'N/A', 1: 'Australia', 2: 'Austria', 3: 'Belgium', 4: 'Brazil', 5: 'Canada', 6: 'China', 7: 'Dominican Republic', 8: 'Finland', 9: 'Greece', 10: 'Hungary', 11: 'India', 12: 'Indonesia', 13: 'Luxembourg', 14: 'Nepal'}, 'criteria': {0: 'referrer=direct', 1: 'referrer=direct', 2: 'referrer=direct', 3: 'referrer=direct', 4: 'referrer=direct', 5: 'referrer=direct', 6: 'referrer=direct', 7: 'referrer=direct', 8: 'referrer=direct', 9: 'referrer=direct', 10: 'referrer=direct', 11: 'referrer=direct', 12: 'referrer=direct', 13: 'referrer=direct', 14: 'referrer=direct'}, 'date': {0: '2013-11-05', 1: '2013-11-05', 2: '2013-11-05', 3: '2013-11-05', 4: '2013-11-05', 5: '2013-11-05', 6: '2013-11-05', 7: '2013-11-05', 8: '2013-11-05', 9: '2013-11-05', 10: '2013-11-05', 11: '2013-11-05', 12: '2013-11-05', 13: '2013-11-05', 14: '2013-11-05'}, 'cpc_cpm_revenue': {0: 0.001, 1: 0.01942, 2: 0.0050000000000000001, 3: 0.002, 4: 0.012200000000000001, 5: 0.020899999999999998, 6: 0.030499999999999999, 7: 0.001, 8: 0.0050000000000000001, 9: 0.019, 10: 0.012, 11: 0.017999999999999999, 12: 0.001, 13: 0.0040000000000000001, 14: 0.001}, 'impressions': {0: 1.0, 1: 12.0, 2: 1.0, 3: 2.0, 4: 14.0, 5: 17.0, 6: 31.0, 7: 1.0, 8: 5.0, 9: 19.0, 10: 12.0, 11: 18.0, 12: 1.0, 13: 1.0, 14: 1.0}, 'clicks': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0, 13: 0.0, 14: 0.0}, 'size': {0: '300x250', 1: '300x250', 2: '300x250', 3: '300x250', 4: '300x250', 5: '300x250', 6: '300x250', 7: '300x250', 8: '300x250', 9: '300x250', 10: '300x250', 11: '300x250', 12: '300x250', 13: '300x250', 14: '300x250'}}
)
df = df.groupby(by=['date','category','country','criteria','size']).sum()
print(df.columns)
Index([u'clicks', u'cpc_cpm_revenue', u'impressions'], dtype=object)
So... wow... that's a bit sad. I'm missing:
'date','category','country','criteria','size'
Answer 0 (score: 4)
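You're not missing anything. You asked for a groupby on the five columns ['date','category','country','criteria','size'], and that's exactly what you got. Those columns are no longer data columns; they are now the levels of the result's index: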
>>> df.head()
                                                       clicks  cpc_cpm_revenue  \
date       category country   criteria        size
2013-11-05 Apps     Australia referrer=direct 300x250       0          0.01942
                    Austria   referrer=direct 300x250       0          0.00500
                    Belgium   referrer=direct 300x250       0          0.00200
                    Brazil    referrer=direct 300x250       0          0.01220
                    Canada    referrer=direct 300x250       0          0.02090

                                                       impressions
date       category country   criteria        size
2013-11-05 Apps     Australia referrer=direct 300x250           12
                    Austria   referrer=direct 300x250            1
                    Belgium   referrer=direct 300x250            2
                    Brazil    referrer=direct 300x250           14
                    Canada    referrer=direct 300x250           17
>>> df.columns
Index([clicks, cpc_cpm_revenue, impressions], dtype=object)
>>> df.index
MultiIndex
[(2013-11-05, Apps, Australia, referrer=direct, 300x250), (2013-11-05, Apps, Austria, referrer=direct, 300x250), (2013-11-05, Apps, Belgium, referrer=direct, 300x250), (2013-11-05, Apps, Brazil, referrer=direct, 300x250), (2013-11-05, Apps, Canada, referrer=direct, 300x250), (2013-11-05, Apps, China, referrer=direct, 300x250), (2013-11-05, Apps, Dominican Republic, referrer=direct, 300x250), (2013-11-05, Apps, Finland, referrer=direct, 300x250), (2013-11-05, Apps, Greece, referrer=direct, 300x250), (2013-11-05, Apps, Hungary, referrer=direct, 300x250), (2013-11-05, Apps, India, referrer=direct, 300x250), (2013-11-05, Apps, Indonesia, referrer=direct, 300x250), (2013-11-05, Apps, Luxembourg, referrer=direct, 300x250), (2013-11-05, Apps, N/A, referrer=direct, 300x250), (2013-11-05, Apps, Nepal, referrer=direct, 300x250)]
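Since the question was about iterating over the rows, here is a minimal sketch of reading the group keys straight from the MultiIndex without moving them back into columns. The small `raw` frame and its values are made up for illustration and stand in for the question's data; only the key columns and one numeric column are kept.

```python
import pandas as pd

# Hypothetical stand-in for the question's data (illustrative values only).
raw = pd.DataFrame({
    "date": ["2013-11-05", "2013-11-05", "2013-11-05"],
    "country": ["Austria", "Brazil", "Brazil"],
    "size": ["300x250", "300x250", "300x250"],
    "impressions": [1.0, 14.0, 3.0],
})

grouped = raw.groupby(by=["date", "country", "size"]).sum()

# The group keys live in the index, so look them up by level name.
sizes = grouped.index.get_level_values("size")

# iterrows() yields (index, row); with a MultiIndex the index is a tuple
# holding one value per group key, in the order passed to groupby.
rows = []
for (date, country, size), row in grouped.iterrows():
    rows.append((date, country, size, row["impressions"]))
```

Unpacking the index tuple in the `for` statement keeps the keys and the aggregated values side by side in each iteration.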
If you want them back as ordinary columns, you can call .reset_index():
>>> df = df.reset_index()
>>> df.head()
         date  category    country         criteria     size  clicks  cpc_cpm_revenue  \
0  2013-11-05      Apps  Australia  referrer=direct  300x250       0          0.01942
1  2013-11-05      Apps    Austria  referrer=direct  300x250       0          0.00500
2  2013-11-05      Apps    Belgium  referrer=direct  300x250       0          0.00200
3  2013-11-05      Apps     Brazil  referrer=direct  300x250       0          0.01220
4  2013-11-05      Apps     Canada  referrer=direct  300x250       0          0.02090

   impressions
0           12
1            1
2            2
3           14
4           17
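Once the keys are ordinary columns again, row iteration can use attribute access. A small sketch, again on a made-up `raw` frame rather than the question's data:

```python
import pandas as pd

# Hypothetical stand-in for the question's data (illustrative values only).
raw = pd.DataFrame({
    "country": ["Austria", "Brazil", "Brazil"],
    "size": ["300x250", "300x250", "300x250"],
    "impressions": [1.0, 14.0, 3.0],
})

# reset_index() moves the index levels back into regular columns.
flat = raw.groupby(by=["country", "size"]).sum().reset_index()

# itertuples() yields one namedtuple per row; index=False drops the
# positional index so only the data columns remain as fields.
records = [
    (r.country, r.size, r.impressions)
    for r in flat.itertuples(index=False)
]
```

`itertuples()` is generally faster than `iterrows()` because it does not build a Series for every row.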
Or, as @Andy Hayden points out, never make them the index in the first place by passing as_index=False:
>>> df = df.groupby(by=['date','category','country','criteria','size'], as_index=False).sum()
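As a quick check that the two routes agree, the sketch below compares `reset_index()` against `as_index=False` on a small made-up frame (the `raw` data and the variable names are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical stand-in for the question's data (illustrative values only).
raw = pd.DataFrame({
    "country": ["Austria", "Brazil", "Brazil"],
    "size": ["300x250", "300x250", "300x250"],
    "impressions": [1.0, 14.0, 3.0],
})

keys = ["country", "size"]

# Route 1: group (keys become the index), then move them back to columns.
via_reset = raw.groupby(by=keys).sum().reset_index()

# Route 2: ask groupby to keep the keys as columns from the start.
via_flag = raw.groupby(by=keys, as_index=False).sum()

# Both frames should have the key columns first, then the aggregates.
same = via_reset.equals(via_flag)
```

For a plain sum like this one the two frames should match, so the choice is mostly a matter of whether the intermediate MultiIndex is useful to you.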