I have a pandas DataFrame (pivoted) with the columns customer_name, current_date and current_day_count, like this:
+----------+--------------+-------------------+
| customer | current_date | current_day_count |
+----------+--------------+-------------------+
| Mark | 2018_02_06 | 15 |
| | 2018_02_09 | 42 |
| | 2018_02_12 | 33 |
| | 2018_02_21 | 82 |
| | 2018_02_27 | 72 |
| Bob | 2018_02_02 | 76 |
| | 2018_02_23 | 11 |
| | 2018_03_04 | 59 |
| | 2018_03_13 | 68 |
| Shawn | 2018_02_11 | 71 |
| | 2018_02_15 | 39 |
| | 2018_02_18 | 65 |
| | 2018_02_24 | 38 |
+----------+--------------+-------------------+
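Now I want to add another new column, previous_day_count, for each customer, where the value on each customer's first day should be 0. The result should have the columns customer, current_date, current_day_count and previous_day_count (0 on the first day):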
+----------+--------------+-------------------+--------------------+
| customer | current_date | current_day_count | previous_day_count |
+----------+--------------+-------------------+--------------------+
| Mark | 2018_02_06 | 15 | 0 |
| | 2018_02_09 | 42 | 33 |
| | 2018_02_12 | 33 | 82 |
| | 2018_02_21 | 82 | 72 |
| | 2018_02_27 | 72 | 0 |
| Bob | 2018_02_02 | 76 | 0 |
| | 2018_02_23 | 11 | 59 |
| | 2018_03_04 | 59 | 68 |
| | 2018_03_13 | 68 | 0 |
| Shawn | 2018_02_11 | 71 | 0 |
| | 2018_02_15 | 39 | 65 |
| | 2018_02_18 | 65 | 38 |
| | 2018_02_24 | 38 | 0 |
+----------+--------------+-------------------+--------------------+
Answer 0: (score: 1)
Try this:
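import pandas as pd
import numpy as np

# Sample data; 'name' plays the role of the customer column
df = pd.DataFrame({'name': ['Mark','Mark','Mark','Mark','Bob','Bob','Bob','Bob'], 'current_day_count': [18,28,29,10,19,92,7,43]})
# Within each name, take the value from the following row (the last row of each name becomes NaN)
df['previous_day_count'] = df.groupby('name')['current_day_count'].shift(-1)
# Blank out the first row of each name
df.loc[df.groupby('name',as_index=False).head(1).index,'previous_day_count'] = np.nan
# Replace the NaN values with 0; assigning the result back avoids in-place modification of a column selection
df['previous_day_count'] = df['previous_day_count'].fillna(0)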
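For reference, here is a minimal sketch of the same idea applied to the question's own data (Mark and Bob only), assuming the pivoted frame has been flattened into ordinary columns named as in the question's table. The `first_rows` helper and the `prior_row_count` column are illustrative names, not part of the original post. Note that `shift(-1)` takes the value from the following row, which is what the expected table in the question shows; if the literal previous row is wanted instead, `shift(1)` may be the better fit, as shown in the last line.

```python
import pandas as pd

# Data copied from the question (Mark and Bob only); assumes flat, non-pivoted columns.
df = pd.DataFrame({
    "customer": ["Mark"] * 5 + ["Bob"] * 4,
    "current_date": [
        "2018_02_06", "2018_02_09", "2018_02_12", "2018_02_21", "2018_02_27",
        "2018_02_02", "2018_02_23", "2018_03_04", "2018_03_13",
    ],
    "current_day_count": [15, 42, 33, 82, 72, 76, 11, 59, 68],
})

grouped = df.groupby("customer")["current_day_count"]

# True on the first row of each customer (hypothetical helper name).
first_rows = df.groupby("customer").cumcount() == 0

# Same logic as the answer: next row's value per customer, first row forced to 0.
df["previous_day_count"] = (
    grouped.shift(-1).mask(first_rows).fillna(0).astype(int)
)

# Alternative, assuming the literal previous row is what is wanted.
df["prior_row_count"] = grouped.shift(1).fillna(0).astype(int)

print(df)
```

The first computed column reproduces the question's expected table for these two customers, while the second carries each customer's prior row forward and leaves only the first row at 0.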