保证金=数据透视表中为True False结果

时间:2019-01-10 14:06:59

标签: python sum pivot-table

我有下面的代码,但是参数margin = True在最后一列中给出了总计的错误结果。如您所见,它应该给出19的结果,但是它给出的结果是其列的所有值均为false。 to_send列是通过将send的接收数减去来实现的。

import pandas as pd
df2 = pd.read_csv("https://www.dropbox.com/s/90y07129zn351z9/test_data.csv?dl=1",encoding="latin-1")


df2['received'] = pd.to_datetime(df2['received'])
df2['sent'] = pd.to_datetime(df2['sent'])
df2['to_send']= df2['received']-df2['sent']


pvt_all = df2.dropna(axis=0, how='all', subset=['received', 'sent', 'to_send'])\
    .pivot_table(index=['site'], values=['received','sent', 'to_send'],\
    aggfunc='count', margins=True, dropna=False)
pvt_all

    received    sent    to_send
site            
2   32.0    27.0    27.0
3   20.0    17.0    17.0
4   33.0    31.0    31.0
5   40.0    31.0    31.0
All 125.0   106.0   106.0

我也以下面的另一种方式做过,但结果却有更多的错误。

import pandas as pd
df=pd.read_csv("https://www.dropbox.com/s/90y07129zn351z9/test_data.csv?dl=1", encoding="latin-1")
pvt_received=df.pivot_table(index=['site'], values = ['received','sent'], aggfunc = {  'received' : 'count' ,'sent': 'count'}, fill_value=0, margins=True) 
pvt_received['to_send']=pvt_received['received']-pvt_received['sent']
column_order = ['received', 'sent','to_send']
pvt_received_ordered = pvt_received.reindex_axis(column_order, axis=1)
pvt_received_ordered.to_csv("test_pivot.csv")
table_to_send = pd.read_csv('test_pivot.csv', encoding='latin-1')
table_to_send.rename(columns={'site':'Site','received':'Date Received','sent':'Date Sent','to_send':'Date To Send'}, inplace=True)
table_to_send.set_index('Site', inplace=True)
table_to_send

Date Received   Date Sent   Date To Send
Site            
2   32.0    27.0    5.0
3   20.0    17.0    3.0
4   33.0    31.0    2.0
5   40.0    31.0    9.0
All 106.0   106.0   0.0

1 个答案:

答案 0 :(得分:0)

import pandas as pd
df2 = pd.read_csv("https://www.dropbox.com/s/90y07129zn351z9/test_data.csv?dl=1",encoding="latin-1")

df2['received'] = pd.to_datetime(df2['received'])
df2['sent'] = pd.to_datetime(df2['sent'])
#Removed operation that was before building pivot table and put it after pivot table is done!

pvt_all = df2.dropna(axis=0, how='all', subset=['received', 'sent'])\
    .pivot_table(index=['site'], values=['received','sent'],\
    aggfunc='count', margins=True, dropna=False)
pvt_all['to_send']= pvt_all['received']-pvt_all['sent'] #Adding operation after making pivot table
pvt_all=pvt_all[['received','sent','to_send']] #Optional, to order columns as desired
pvt_all

received    sent    to_send
site            
2   32.0    27.0    5.0
3   20.0    17.0    3.0
4   33.0    31.0    2.0
5   40.0    31.0    9.0
All 125.0   106.0   19.0