I want to perform the following task on a DataFrame:
I tried this Python 3 code.
def get_year(x):
    return x.split(".")[-1]

def get_month(x):
    return x.split(".")[-2]
transactions['year'] = transactions['date'].map(get_year)
transactions['month'] = transactions['date'].map(get_month)
transactions['item_cnt_day'] = transactions['item_cnt_day'].replace(-1.0, 0)
transactions["Revenue"] = transactions["item_price"]*transactions["item_cnt_day"]
sort = transactions[(transactions["year"] == 2014) & (transactions["month"] == 9)]
max(sort.groupby(transactions["Revenue"]).sum())
date        date_block_num  shop_id  item_id  item_price  item_cnt_day  year  month  Revenue
02.01.2013  0               59       22154    999.00      1.0           2013  01     999.00
03.01.2013  0               25       2552     899.00      1.0           2013  01     899.00
05.01.2013  0               25       2552     899.00      0.0           2013  01     0.00
06.01.2013  0               25       2554     1709.05     1.0           2013  01     1709.05
15.01.2013  0               25       2555     1099.00     1.0           2013  01     1099.00
Answer 0 (score: 1)
You can use:
import pandas as pd

# sample data changed to include September 2014
print(transactions)
         date  date_block_num  shop_id  item_id  item_price  item_cnt_day
0  02.01.2013               0       59    22154      999.00           1.0
1  03.01.2013               0       25     2552      899.00           1.0
2  05.09.2014               0       25     2552      899.00           0.0
3  06.09.2014               0       25     2554     1709.05           1.0
4  15.09.2014               0       26     2555     1099.00           1.0
First convert the date column to datetime and extract the year and month as integers:
transactions['date'] = pd.to_datetime(transactions['date'], dayfirst=True)
transactions['year'] = transactions['date'].dt.year
transactions['month'] = transactions['date'].dt.month
transactions['item_cnt_day'] = transactions['item_cnt_day'].replace(-1.0, 0)
transactions["Revenue"] = transactions["item_price"]*transactions["item_cnt_day"]
print(transactions)
        date  date_block_num  shop_id  item_id  item_price  item_cnt_day  \
0 2013-01-02               0       59    22154      999.00           1.0
1 2013-01-03               0       25     2552      899.00           1.0
2 2014-09-05               0       25     2552      899.00           0.0
3 2014-09-06               0       25     2554     1709.05           1.0
4 2014-09-15               0       26     2555     1099.00           1.0

   year  month  Revenue
0  2013      1   999.00
1  2013      1   899.00
2  2014      9     0.00
3  2014      9  1709.05
4  2014      9  1099.00
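
The conversion above relies on dayfirst=True. If every value is known to follow the DD.MM.YYYY layout seen in the sample, passing an explicit format string is a somewhat stricter alternative, since strings that do not match raise an error instead of being guessed at. A minimal sketch, assuming that layout holds for the whole column (the small dates Series is made up for illustration):

```python
import pandas as pd

# Hypothetical sample in the DD.MM.YYYY layout used by the question.
dates = pd.Series(["02.01.2013", "05.09.2014", "15.09.2014"])

# Parse with an explicit day.month.year format rather than dayfirst=True.
parsed = pd.to_datetime(dates, format="%d.%m.%Y")

years = parsed.dt.year    # integer years
months = parsed.dt.month  # integer months, 1 to 12
print(years.tolist(), months.tolist())
```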
sort = transactions[(transactions["year"] == 2014) & (transactions["month"] == 9)]
print(sort)
        date  date_block_num  shop_id  item_id  item_price  item_cnt_day  \
2 2014-09-05               0       25     2552      899.00           0.0
3 2014-09-06               0       25     2554     1709.05           1.0
4 2014-09-15               0       26     2555     1099.00           1.0

   year  month  Revenue
2  2014      9     0.00
3  2014      9  1709.05
4  2014      9  1099.00
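
A likely reason the filter in the question selected no rows: splitting the date string yields strings such as "2014" and "09", and comparing those with the integers 2014 and 9 never matches. The sketch below, on made-up data, contrasts the string-based extraction with the datetime-based one:

```python
import pandas as pd

# Hypothetical two-row sample in the question's date layout.
df = pd.DataFrame({"date": ["03.01.2013", "06.09.2014"]})

# String-based extraction, as in the question: values are str, e.g. "2014".
df["year_str"] = df["date"].map(lambda x: x.split(".")[-1])

# Datetime-based extraction, as in this answer: values are integers.
df["year_int"] = pd.to_datetime(df["date"], format="%d.%m.%Y").dt.year

str_matches = int((df["year_str"] == 2014).sum())  # str compared with int
int_matches = int((df["year_int"] == 2014).sum())  # int compared with int
print(str_matches, int_matches)
```

Casting the string columns with astype(int) would also make the original filter work, but keeping a real datetime column tends to be more convenient for later date arithmetic.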
Then aggregate Revenue with sum, grouped by the shop_id column:
out1 = sort.groupby('shop_id', as_index=False)['Revenue'].sum()
print(out1)
   shop_id  Revenue
0       25  1709.05
1       26  1099.00
Finally, get the shop with the maximum Revenue in out1:
out2 = out1.set_index('shop_id')['Revenue'].idxmax()
print(out2)
25
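
Putting the steps together, one possible end-to-end sketch looks like this. The helper name top_shop_by_revenue is hypothetical, the column names follow the question's DataFrame, and the sample rows repeat the modified data used in this answer:

```python
import pandas as pd

def top_shop_by_revenue(df, year, month):
    """Return the shop_id with the highest total revenue in the given month.

    Hypothetical helper; assumes dates are strings in DD.MM.YYYY layout.
    """
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"], format="%d.%m.%Y")
    # Treat a count of -1 as 0, mirroring the replace step used above.
    out["item_cnt_day"] = out["item_cnt_day"].replace(-1.0, 0)
    out["Revenue"] = out["item_price"] * out["item_cnt_day"]
    mask = (out["date"].dt.year == year) & (out["date"].dt.month == month)
    # Sum revenue per shop, then take the index label (shop_id) of the maximum.
    return out.loc[mask].groupby("shop_id")["Revenue"].sum().idxmax()

transactions = pd.DataFrame({
    "date": ["02.01.2013", "03.01.2013", "05.09.2014", "06.09.2014", "15.09.2014"],
    "shop_id": [59, 25, 25, 25, 26],
    "item_price": [999.00, 899.00, 899.00, 1709.05, 1099.00],
    "item_cnt_day": [1.0, 1.0, 0.0, 1.0, 1.0],
})

best_shop = top_shop_by_revenue(transactions, 2014, 9)
print(best_shop)
```

Note that idxmax raises an error when no rows fall in the requested month, so a caller may want to check for an empty selection first.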