Good morning, I have the following df:
print(df)
Date        cod_id  Sales  Initial_stock
01/01/2017  1       5      5
01/01/2017  2       4      8
02/01/2017  1       1      5
...
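Because the real dataset contains some errors in "Initial_stock", I would like to create a new column, computed separately for each cod_id (= product), as follows:
the previous row's stock for that cod_id + the current value of Initial_stock - Sales, so that: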
print(df_final)
Date        cod_id  Sales  Initial_stock  new
01/01/2017  1       5      5              0
01/01/2017  2       4      8              4
02/01/2017  1       1      5              4
...
where the last value, 4 for "cod_id 1", is computed as 0 + 5 - 1 = 4 (the 0 is the previous value of "new" for cod_id 1).
Answer 0 (score: 1)
import pandas as pd

# Initial data (the Date column is left out here)
d = {'cod_id': [1, 2, 1], 'Sales': [5, 4, 1], 'Initial_stock': [5, 8, 5]}

# For display only: print the DataFrame built from the initial data
df = pd.DataFrame(data=d)
print(df)

# List that will hold the values of the new column
new = []
i = 0
# Loop over every row and compute new = Initial_stock - Sales
while i < len(d['cod_id']):
    new_value = d['Initial_stock'][i] - d['Sales'][i]
    new.append(new_value)  # append the new value to the list
    i += 1

# For display only: print the list to check the calculation
print(new)

# Add the new values to the original dict
d['new'] = new

# For display only: rebuild the DataFrame with the new column and print it again
df = pd.DataFrame(data=d)
print(df)
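Note that the loop above computes Initial_stock - Sales row by row. It reproduces the three sample values, but it does not add the previous row's value for the same cod_id. A minimal sketch of the running version follows, assuming the rule is read as new = previous new for that cod_id + Initial_stock - Sales, that every product starts from 0, and that the rows are already in chronological order. The name `net_change` is only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Date': ['01/01/2017', '01/01/2017', '02/01/2017'],
    'cod_id': [1, 2, 1],
    'Sales': [5, 4, 1],
    'Initial_stock': [5, 8, 5],
})

# Net change in stock for each row (assumed reading of the rule)
net_change = df['Initial_stock'] - df['Sales']

# Running total of the net change within each cod_id, in row order.
# Assumes each product starts from a balance of 0.
df['new'] = net_change.groupby(df['cod_id']).cumsum()

print(df)
```

With the sample data this gives 0, 4, 4 in the new column, the same as df_final in the question. If the real data is not sorted by date, it would need to be sorted before taking the cumulative sum.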