I have a data frame like this:
s = RfcConnection(sysrfc="QE2")
result = s.delete_to(lgnum="220", tanum="9592250", cancl="X", commit_work="X")

# Method of the connection class used above; `pad` and `username` come from the surrounding module.
def delete_to(self, lgnum=None, tanum=None, solex=None, cancl=None, subst=None, qname=username, update_task=None, commit_work=None, t_ltap_cancl=None):
    return_msg = None
    assert (lgnum is not None and tanum is not None), "Warehouse number as lgnum and Transfer Order as tanum are required for function delete_to"
    if cancl is None:
        cancl = "X"
    try:
        if t_ltap_cancl is None:
            # No item table given: pass the cancel flag.
            return_msg = self.conn.call("L_TO_CANCEL",
                                        I_LGNUM=lgnum,
                                        I_TANUM=pad(tanum, 10, "0"),
                                        I_CANCL=cancl)
        else:
            # Item table given: pass it instead of the cancel flag.
            return_msg = self.conn.call("L_TO_CANCEL",
                                        I_LGNUM=lgnum,
                                        I_TANUM=pad(tanum, 10, "0"),
                                        T_LTAP_CANCL=t_ltap_cancl)
    except pyrfc._exception.ABAPApplicationError as e:
        if e.msg_class == "L3" and e.msg_number == "354":
            # Substitute the transfer order number into the message text.
            return_msg = self.get_error_code(Language="EN", Area=e.msg_class, Message=e.msg_number)[0][0].replace("&", "{}".format(tanum))
        else:
            return_msg = self.get_error_code(Language="EN", Area=e.msg_class, Message=e.msg_number)
    except pyrfc._exception.ABAPRuntimeError as e:
        if e.msg_class == "L3" and e.msg_number == "037":
            # Substitute the first message variable into the message text.
            return_msg = self.get_error_code(Language="EN", Area=e.msg_class, Message=e.msg_number)[0][0].replace("&", "{}".format(e.msg_v1))
        else:
            return_msg = self.get_error_code(Language="EN", Area=e.msg_class, Message=e.msg_number)
    except Exception as e:
        # Any other failure is returned to the caller rather than raised.
        return_msg = e
    return return_msg
I want to accumulate along name and date. I mean, the expected result for this example would be:
df =
  name  amount  date
0    A      10     1
1    B      15     1
2    A       5     2
3    C       7     3
4    A       8     4
5    B      10     4
6    C      11     4
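I want to show the cumulative value over the periods given by the date column. For example, A has a value of 10 in period 1, 5 in period 2, 0 in period 3 (because it does not appear there), and 8 in period 4, so df_result shows the running total. C does not show up until period 3, because it has no value before that period.
I have tried different combinations of groupby, cumsum, and even stack, but I could not get anything close to the goal.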
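To make the example reproducible, here is a minimal sketch that builds the frame above and checks the per-period reasoning for A by hand. The variable names `periods`, `a_per_period`, and `a_running` are only illustrative and are not part of the original question.

```python
import pandas as pd

# The example data from the question.
df = pd.DataFrame({
    "name": ["A", "B", "A", "C", "A", "B", "C"],
    "amount": [10, 15, 5, 7, 8, 10, 11],
    "date": [1, 1, 2, 3, 4, 4, 4],
})

# Every period that occurs anywhere in the data: 1, 2, 3, 4.
periods = sorted(df["date"].unique())

# Amount for A in each period, with 0 where A has no row (period 3).
a_per_period = (
    df[df["name"] == "A"]
    .groupby("date")["amount"].sum()
    .reindex(periods, fill_value=0)
)

# Running total for A across all periods.
a_running = a_per_period.cumsum()

print(a_per_period.tolist())  # [10, 5, 0, 8]
print(a_running.tolist())     # [10, 15, 15, 23]
```

The running total stays at 15 in period 3 because A contributes nothing there.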
Answer 0 (score: 1)
See if this helps:
>>> df.groupby(by=['name','date']).sum().groupby(level=[0]).cumsum().reset_index()
  name  date  amount
0    A     1      10
1    A     2      15
2    A     4      23
3    B     1      15
4    B     4      25
5    C     3       7
6    C     4      18
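The one-liner above chains three steps. As a sketch, it can be split into named intermediates to make each step visible. The names `per_pair`, `running`, and `result` are only illustrative, and the frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "B", "A", "C", "A", "B", "C"],
    "amount": [10, 15, 5, 7, 8, 10, 11],
    "date": [1, 1, 2, 3, 4, 4, 4],
})

# Step 1: total amount per (name, date) pair; the result has a MultiIndex.
per_pair = df.groupby(["name", "date"])["amount"].sum()

# Step 2: running total within each name (index level 0).
running = per_pair.groupby(level=0).cumsum()

# Step 3: turn the MultiIndex back into ordinary columns.
result = running.reset_index()
print(result)
```

Note that this form only has rows for (name, date) pairs that exist in the data, so A has no row for date 3.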
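Another approach, as @Jon noted in the comments, is a pivot, which gets you closer to what you showed.
>>> df = df.pivot(index='date', columns='name', values='amount').fillna(0).stack().groupby(level=1).cumsum().astype('int')[lambda v: v != 0].reset_index()
Rename the last column, because reset_index labels it 0.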
>>> df.rename(columns={0: 'amount'}, inplace=True)
>>> df
   date name  amount
0     1    A      10
1     1    B      15
2     2    A      15
3     2    B      15
4     3    A      15
5     3    B      15
6     3    C       7
7     4    A      23
8     4    B      25
9     4    C      18
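One caveat with the `v != 0` filter is that it would also drop a row whose running total is legitimately zero. The following is a sketch of a variation, not the answer's own method: it uses `pivot_table` (which sums duplicate date/name pairs instead of raising) and `pd.crosstab` to track when each name first appears. The names `wide`, `seen`, and `result` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "B", "A", "C", "A", "B", "C"],
    "amount": [10, 15, 5, 7, 8, 10, 11],
    "date": [1, 1, 2, 3, 4, 4, 4],
})

# Wide table: one row per date, one column per name, 0 where a name has no row.
wide = df.pivot_table(index="date", columns="name", values="amount",
                      aggfunc="sum", fill_value=0)

# True from the first date on which a name has at least one row.
seen = pd.crosstab(df["date"], df["name"]).cumsum().gt(0)

# Running total per name, blanked out before the name first appears.
result = (
    wide.cumsum()
    .where(seen)
    .stack()
    .dropna()                 # explicit, since stack's NaN handling differs across pandas versions
    .astype(int)
    .rename("amount")
    .reset_index()
    .sort_values(["date", "name"], ignore_index=True)
)
print(result)
```

On the example data this gives the same ten rows as the output above, with the amount column already named.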