将带有月份和年份的列添加到熊猫中的新日期

时间:2021-05-10 01:08:48

标签: python pandas

我有这个数据集:

df = pd.DataFrame()
df['year'] = [2011,2011,2011,2011,2011,2011,2011,2011,2011,2011,2011,2011]
df['month'] = [1,2,3,4,5,6,1,2,3,4,5,6]
df['after'] = [0,0,0,1,1,1,0,0,0,1,1,1]
df['campaign'] = [0,0,0,0,0,0,1,1,1,1,1,1]
df['sales'] = [10000,11000,12000,10500,10000,9500,7000,8000,5000,6000,6000,7000]

而且我想要一个新的列日期,它将年和月组合成年-月日期。我试过了:

df['my_month'] = df['year']*100 + df['month'] + 1

但我不知道下一步该做什么。任何帮助将不胜感激。

2 个答案:

答案 0 :(得分:2)

如果我们需要月份的开始日期,那么

df['date'] = pd.to_datetime(df.year.astype(str) + '-' + df.month.astype(str))

样本输出

    year    month   after   campaign    sales   date
0   2011    1       0       0           10000   2011-01-01
1   2011    2       0       0           11000   2011-02-01
2   2011    3       0       0           12000   2011-03-01
3   2011    4       1       0           10500   2011-04-01

根据评论编辑

需要 year and month 格式时

pd.to_datetime(df.year.astype(str) + '-' + df.month.astype(str)).dt.to_period('M')

示例输出

    year    month   after   campaign    sales   date
0   2011    1       0       0           10000   2011-01
1   2011    2       0       0           11000   2011-02
2   2011    3       0       0           12000   2011-03
3   2011    4       1       0           10500   2011-04

答案 1 :(得分:1)

import pandas as pd
from datetime import date

def get_date(year, month):
    return date(year, month, 1)

def create_dataframe():
    df = pd.DataFrame()
    df['year'] = [2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011]
    df['month'] = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
    df['after'] = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
    df['campaign'] = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
    df['sales'] = [10000, 11000, 12000, 10500, 10000, 9500, 7000, 8000, 5000, 6000, 6000, 7000]

    df['my_month'] = df.apply(lambda x: get_date(x.year, x.month), axis=1)

    print(df.to_string())
    
if __name__ == '__main__':
    create_dataframe()

输出

   year  month  after  campaign  sales    my_month
0   2011      1      0         0  10000  2011-01-01
1   2011      2      0         0  11000  2011-02-01
2   2011      3      0         0  12000  2011-03-01
3   2011      4      1         0  10500  2011-04-01
4   2011      5      1         0  10000  2011-05-01
5   2011      6      1         0   9500  2011-06-01
6   2011      1      0         1   7000  2011-01-01
7   2011      2      0         1   8000  2011-02-01
8   2011      3      0         1   5000  2011-03-01
9   2011      4      1         1   6000  2011-04-01
10  2011      5      1         1   6000  2011-05-01
11  2011      6      1         1   7000  2011-06-01