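I have a df that contains a summary of JIRA ticket statuses for each time period, with counts for OPEN, CLOSE, and OTHER. I want to see how many tickets were added over time.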
period Status Counts
No. 1 Apr 06 2019 to Apr 12 2019 CLOSE 1026
No. 1 Apr 06 2019 to Apr 12 2019 OPEN 2914
No. 1 Apr 06 2019 to Apr 12 2019 OTHER 264
No. 2 Mar 30 2019 to Apr 05 2019 CLOSE 1307
No. 2 Mar 30 2019 to Apr 05 2019 OPEN 2212
No. 2 Mar 30 2019 to Apr 05 2019 OTHER 256
In period No. 1, the OPEN count rose from 2212 (period No. 2) to 2914, so 702 tickets were added in period No. 1. How can I add and display this as an extra column?
period Status Counts Added
No. 1 Apr 06 2019 to Apr 12 2019 CLOSE 1026 702 (2914-2212)
No. 1 Apr 06 2019 to Apr 12 2019 OPEN 2914 702
No. 1 Apr 06 2019 to Apr 12 2019 OTHER 264 702
No. 2 Mar 30 2019 to Apr 05 2019 CLOSE 1307 (2212 minus xxx)
No. 2 Mar 30 2019 to Apr 05 2019 OPEN 2212 (2212 minus xxx)
No. 2 Mar 30 2019 to Apr 05 2019 OTHER 256 (2212 minus xxx)
Answer 0 (score: 2)
You can take the difference across the OPEN rows, then use transform('first') to broadcast those values back onto every row of each period.
u = df.assign(Added=df.loc[df.Status.eq('OPEN'), 'Counts'].diff(-1))
u.assign(Added=u.groupby('period')['Added'].transform('first'))
period Status Counts Added
0 No. 1 Apr 06 2019 to Apr 12 2019 CLOSE 1026 702.0
1 No. 1 Apr 06 2019 to Apr 12 2019 OPEN 2914 702.0
2 No. 1 Apr 06 2019 to Apr 12 2019 OTHER 264 702.0
3 No. 2 Mar 30 2019 to Apr 05 2019 CLOSE 1307 NaN
4 No. 2 Mar 30 2019 to Apr 05 2019 OPEN 2212 NaN
5 No. 2 Mar 30 2019 to Apr 05 2019 OTHER 256 NaN
Answer 1 (score: 0)
Use the diff() function, then use backward and forward fill to fill in the NaN values.
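This answer gives no code, so here is a minimal sketch of one way to read it. The sample frame is rebuilt from the question's data. Filling within each period (rather than across the whole column) is an assumption, made so that a value from one period does not spill into another.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "period": ["No. 1 Apr 06 2019 to Apr 12 2019"] * 3
              + ["No. 2 Mar 30 2019 to Apr 05 2019"] * 3,
    "Status": ["CLOSE", "OPEN", "OTHER"] * 2,
    "Counts": [1026, 2914, 264, 1307, 2212, 256],
})

# Difference between each OPEN count and the OPEN count of the next (older) period
open_diff = df.loc[df["Status"] == "OPEN", "Counts"].diff(-1)

# Assignment aligns on the index, so only the OPEN rows receive a value
df["Added"] = open_diff

# Forward- and backward-fill inside each period so every row carries the period's value
df["Added"] = df.groupby("period")["Added"].transform(lambda s: s.ffill().bfill())
```

The oldest period has nothing to compare against, so its rows stay NaN, as in Answer 0.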
Answer 2 (score: 0)
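Start by defining the function that will be applied below:
def fn(src):
    # Return the Counts value of the OPEN row within one period's group
    return src.query("Status == 'OPEN'").Counts
Then apply this function to each period group:
# downcast='infer' turns the filled floats back into ints (deprecated in recent pandas releases)
df2 = df.groupby('period').apply(fn).diff(-1)\
    .fillna(0, downcast='infer')\
    .reset_index(level=1, drop=True).to_frame('Added')
You will get a DataFrame with an Added column, indexed by period.
The last step is to merge the two DataFrames, joining df on its period column against the index of df2.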
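The merge call itself does not appear in the answer, so the following is only a sketch of how that last step might look. The sample frame is rebuilt from the question. To stay self-contained and avoid the deprecated downcast keyword, df2 is built here with set_index instead of groupby/apply, which is a substitution and not the answer's own code. The name result is hypothetical.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "period": ["No. 1 Apr 06 2019 to Apr 12 2019"] * 3
              + ["No. 2 Mar 30 2019 to Apr 05 2019"] * 3,
    "Status": ["CLOSE", "OPEN", "OTHER"] * 2,
    "Counts": [1026, 2914, 264, 1307, 2212, 256],
})

# One OPEN count per period, indexed by period (stand-in for the groupby/apply step)
open_counts = df[df["Status"] == "OPEN"].set_index("period")["Counts"]

# Difference to the next (older) period; the oldest period gets 0, as in the answer
df2 = open_counts.diff(-1).fillna(0).astype(int).to_frame("Added")

# Attach the per-period value to every row of df (assumed form of the merge)
result = df.merge(df2, left_on="period", right_index=True, how="left")
print(result)
```

Unlike Answer 0, this variant reports 0 rather than NaN for the oldest period, because of the fillna(0) step.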