Adding and subtracting values based on year conditions in Python

Time: 2018-11-29 22:10:29

Tags: python python-3.x pandas python-2.7 dataframe

I have two DataFrames with dates. The first DataFrame repeats the dates for every Type and every State, because it holds cumulative sums, as shown below:

Date          State     Type      Value
2010-01-01    AK        NUC       10
2010-02-01    AK        NUC       10
2010-03-01    AK        NUC       10
.
.
2010-01-01    CO        NUC       2
2010-02-01    CO        NUC       2
.
.
2010-01-01    AK        WND       20
2010-02-01    AK        WND       21
.
.
2018-08-01   .......

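What I need to do is take the second DataFrame and, for each Type and State, add its Value on the "Operating Date" and then subtract it again on the "Retirement Date", both matched against the original "Date". The second DataFrame looks like this: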

Operating Date   Retirement Date   Type    State       Value
2010-02-01       2010-04-01        NUC     AK          1
2011-02-01       2014-02-01        NUC     AK          2
2011-03-01       2016-03-01        NUC     AK          10
.
.

.
2018-08-01   .......

For example, for AK the values would be added and subtracted like this:

if AK(Date) == AK(Operating Date):
      AK(Value, Date) = AK(Value, Date) + AK(Value, Operating Date)

elif AK(Date) == AK(Retirement Date):
      AK(Value, Date) = AK(Value, Date) - AK(Value, Retirement Date)
else:
      continue

The actual output DataFrame (for AK 'NUC' only) would be:

Date          State     Type      Value
2010-01-01    AK        NUC       10
2010-02-01    AK        NUC       11
2010-03-01    AK        NUC       11
2010-04-01    AK        NUC       10
.
.
2011-01-01    AK        NUC       10
2011-02-01    AK        NUC       12
2011-03-01    AK        NUC       22
2011-04-01    AK        NUC       22
.
.
2016-01-01    AK        NUC       22
2016-02-01    AK        NUC       22
2016-03-01    AK        NUC       12
2016-04-01    AK        NUC       12
.
.

How can I perform this kind of operation?

1 answer:

Answer 0: (score: 1)

The main DataFrame used in the code below

df

Date        State   Type    Value
2010-01-01  AK      NUC     10
2010-02-01  AK      NUC     10
2010-03-01  AK      NUC     10
2010-01-01  CO      NUC     2
2010-02-01  CO      NUC     2
2010-01-01  AK      WND     20
2010-02-01  AK      WND     21

The changes you want to apply to the main DataFrame. Note that I replaced the spaces in the column names with _

delta

Operating_Date  Retirement_Date Type    State   Value
2010-02-01      2010-04-01      NUC     AK      1
2011-02-01      2014-02-01      NUC     AK      2
2011-03-01      2016-03-01      NUC     AK      10
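
To run the code that follows, the two sample frames can be built like this. It is only a minimal sketch: it assumes the date columns are parsed as datetimes (the question does not show the dtypes) and uses just the sample rows from the tables above.

```python
import pandas as pd

# Main frame: cumulative values per Date / State / Type (sample rows from above).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2010-01-01", "2010-02-01", "2010-03-01",
        "2010-01-01", "2010-02-01",
        "2010-01-01", "2010-02-01",
    ]),
    "State": ["AK", "AK", "AK", "CO", "CO", "AK", "AK"],
    "Type": ["NUC", "NUC", "NUC", "NUC", "NUC", "WND", "WND"],
    "Value": [10, 10, 10, 2, 2, 20, 21],
})

# Changes: each row adds Value on Operating_Date and removes it on Retirement_Date.
delta = pd.DataFrame({
    "Operating_Date": pd.to_datetime(["2010-02-01", "2011-02-01", "2011-03-01"]),
    "Retirement_Date": pd.to_datetime(["2010-04-01", "2014-02-01", "2016-03-01"]),
    "Type": ["NUC", "NUC", "NUC"],
    "State": ["AK", "AK", "AK"],
    "Value": [1, 2, 10],
})
```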

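The plan of attack is to work with a single date column. To do that, we need to combine the retirement dates and the operating dates into one column, making the value negative where the date is a retirement date and keeping it positive where it is an operating date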

import pandas as pd

# First make a copy of delta. We call these the cancellations: they use the
# Retirement_Date as the date and carry the value as a negative number
cx = delta.copy()
cx['Date'] = cx['Retirement_Date']
cx.drop(['Operating_Date', 'Retirement_Date'], axis=1, inplace=True)
cx['Value'] *= -1

# In the original delta we assign the operating date as the date value
delta['Date'] = delta['Operating_Date']
delta.drop(['Operating_Date', 'Retirement_Date'], axis=1, inplace=True)

# Then append the cancellations to the main delta frame and rename the Value
# column to Delta. DataFrame.append was removed in pandas 2.0, so use pd.concat
delta = pd.concat([delta, cx], ignore_index=True)
delta.rename(columns={'Value': 'Delta'}, inplace=True)

We now have a DataFrame with a single date column that holds every positive and negative change for each date we want to track

delta

Type    State   Delta   Date
NUC     AK      1       2010-02-01
NUC     AK      2       2011-02-01
NUC     AK      10      2011-03-01
NUC     AK      -1      2010-04-01
NUC     AK      -2      2014-02-01
NUC     AK      -10     2016-03-01
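
The same long-format frame can also be produced in one step with `DataFrame.melt`. The sketch below is an alternative to the copy-and-append code, not part of the original answer; the names `delta_wide`, `events` and `signs` are made up for illustration, and the dates are assumed to be datetimes.

```python
import pandas as pd

delta_wide = pd.DataFrame({
    "Operating_Date": pd.to_datetime(["2010-02-01", "2011-02-01", "2011-03-01"]),
    "Retirement_Date": pd.to_datetime(["2010-04-01", "2014-02-01", "2016-03-01"]),
    "Type": ["NUC", "NUC", "NUC"],
    "State": ["AK", "AK", "AK"],
    "Value": [1, 2, 10],
})

# Stack the two date columns into one "Date" column; "Event" records which
# original column each row came from.
events = delta_wide.melt(
    id_vars=["Type", "State", "Value"],
    value_vars=["Operating_Date", "Retirement_Date"],
    var_name="Event",
    value_name="Date",
)

# Operating dates add the value, retirement dates subtract it.
signs = events["Event"].map({"Operating_Date": 1, "Retirement_Date": -1})
events["Delta"] = events["Value"] * signs
events = events[["Type", "State", "Delta", "Date"]]
```

Keeping the `Event` column around a little longer can help when debugging, since it shows whether a given row came from an operating date or a retirement date.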

Now all we have to do is add the cumulative sum of the changes to the main DataFrame

# Start by merging the data frames. The shared column names (Date, State, Type)
# are exactly the keys we want, so we only need to specify an outer join
df = df.merge(delta, how='outer')
# If the delta has dates that aren't in the main dataframe, we need to carry the
# last known value forward, but first sort by date so that the forward fill and
# the cumulative sum work
df.sort_values(['Type', 'State', 'Date'], inplace=True)

df['Value'] = df.groupby(['State', 'Type'])['Value'].ffill()

# For the dates where we have no changes, fill with zeros
# (assign the result back instead of calling fillna in place on a column)
df['Delta'] = df['Delta'].fillna(0)

# Now add the cumulative sum of the delta to the Value column
df['Value'] += df.groupby(['State', 'Type'])['Delta'].cumsum().astype(int)

# Lastly, remove the Delta column again and we're done
del df['Delta']

The final DataFrame, which is hopefully what you are after

df

Date        State   Type    Value
2010-01-01  AK      NUC     10
2010-02-01  AK      NUC     11
2010-03-01  AK      NUC     11
2010-04-01  AK      NUC     10
2011-02-01  AK      NUC     12
2011-03-01  AK      NUC     22
2014-02-01  AK      NUC     20
2016-03-01  AK      NUC     10
2010-01-01  CO      NUC     2
2010-02-01  CO      NUC     2
2010-01-01  AK      WND     20
2010-02-01  AK      WND     21
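
One caveat: if two changes for the same State and Type fall on the same date, the outer merge produces two rows for that date. A possible way around this is to sum the changes per date before merging. The sketch below wraps the whole procedure in a function under that assumption; the name `apply_changes` is hypothetical, the date columns are assumed to be datetimes, and change dates earlier than the first date of a group would still leave missing values.

```python
import pandas as pd

def apply_changes(df, delta):
    """Add each change on its operating date and remove it on its retirement date."""
    keys = ["State", "Type", "Date"]
    # One signed row per event: +Value when operation starts, -Value at retirement.
    start = delta.rename(columns={"Operating_Date": "Date"})[keys + ["Value"]]
    end = delta.rename(columns={"Retirement_Date": "Date"})[keys + ["Value"]]
    end = end.assign(Value=-end["Value"])
    events = pd.concat([start, end], ignore_index=True)
    # Sum events sharing the same State/Type/Date so the merge keeps one row per date.
    events = (events.groupby(keys, as_index=False)["Value"].sum()
                    .rename(columns={"Value": "Delta"}))

    out = df.merge(events, on=keys, how="outer").sort_values(keys)
    # Carry the last known value forward onto dates that only exist in the events.
    out["Value"] = out.groupby(["State", "Type"])["Value"].ffill()
    out["Delta"] = out["Delta"].fillna(0)
    out["Value"] = out["Value"] + out.groupby(["State", "Type"])["Delta"].cumsum()
    return out.drop(columns="Delta").reset_index(drop=True)

# Sample data from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2010-01-01", "2010-02-01", "2010-03-01",
                            "2010-01-01", "2010-02-01",
                            "2010-01-01", "2010-02-01"]),
    "State": ["AK", "AK", "AK", "CO", "CO", "AK", "AK"],
    "Type": ["NUC", "NUC", "NUC", "NUC", "NUC", "WND", "WND"],
    "Value": [10, 10, 10, 2, 2, 20, 21],
})
delta = pd.DataFrame({
    "Operating_Date": pd.to_datetime(["2010-02-01", "2011-02-01", "2011-03-01"]),
    "Retirement_Date": pd.to_datetime(["2010-04-01", "2014-02-01", "2016-03-01"]),
    "Type": ["NUC", "NUC", "NUC"],
    "State": ["AK", "AK", "AK"],
    "Value": [1, 2, 10],
})

result = apply_changes(df, delta)
```

Unlike the step-by-step code, this version leaves the input `delta` untouched and sorts by State before Type, so the row order differs from the table above while the values per group are the same.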