How do I add a given number of months to a date in a pandas DataFrame?

Time: 2019-04-25 09:15:34

Tags: python pandas date sum

I have a dataset like this:

ID,DATE,NO,MONTH
1,24/04/2019,6,2019/09
1,24/04/2019,7,2019/09

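I have a number of months to add to the DATE column. For example, in my new column I want to see 24/04/2019 plus 6 months -> 2019/10.

The error is:

TypeError: can only concatenate str (not "int") to str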

import pandas as pd
import numpy as np

dataset = pd.read_csv("denemedf.txt", delimiter=",")
print(dataset['DATE'])
dataset['ADDED'] = 0 #new column
data_to_Array = np.asarray(dataset['DATE'])
#print(data_to_Array)

numbers = [3,6]

for i in range(len(data_to_Array)):
    added_value = data_to_Array[i] + numbers[i]
    dataset['ADDED'][i] = added_value

from datetime import datetime
print ( dataset['ADDED'].strftime("%Y/%m") )

How can I get this result in dataset['ADDED'] in year/month format?

I hope I have explained clearly what I want to do.

The expected result is as follows:

ID,DATE,NO,MONTH,ADDED
1,24/04/2019,6,2019/09,2019/7
1,24/04/2019,7,2019/09,2019/10

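3 answers:

Answer 0: (score: 1)

You should first convert the strings in the DATE column to datetime (the code below uses datetime.datetime.strptime). Then you can use a lambda function with pandas DateOffset to create the new column.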

import datetime
import pandas as pd

dataset = pd.DataFrame({'DATE': ['24/04/2019', '24/04/2019'], 'NO': [6,7]})
dataset['DATE'] = dataset.apply(lambda x: datetime.datetime.strptime(x['DATE'], '%d/%m/%Y'), axis=1)
dataset['ADDED'] = dataset.apply(lambda x: x['DATE'] + pd.DateOffset(months=x['NO']), axis=1)
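The TypeError in the question comes from adding an integer to a date that is still a string, which is why the conversion step matters. The snippet above leaves ADDED as full timestamps, so here is a minimal sketch of one way to finish with the year/month strings the question asks for. It assumes the same two-row sample data; the names dates and shifted are only illustrative.

```python
import pandas as pd

dataset = pd.DataFrame({"DATE": ["24/04/2019", "24/04/2019"], "NO": [6, 7]})

# Parse the day/month/year strings into real timestamps first.
dates = pd.to_datetime(dataset["DATE"], format="%d/%m/%Y")

# Add each row's own month count; int() guards against NumPy integer types.
shifted = pd.Series(
    [d + pd.DateOffset(months=int(n)) for d, n in zip(dates, dataset["NO"])],
    index=dataset.index,
)

# Keep only year and month, as strings.
dataset["ADDED"] = shifted.dt.strftime("%Y/%m")
print(dataset)
```

With NO as the month count, April 2019 becomes 2019/10 and 2019/11 for the two rows.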

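Answer 1: (score: 1)

Use Series.dt.to_period to convert the dates to monthly periods, so you can add the numbers from a list or from a column, and finally use strftime to convert the output to strings: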


numbers = [3,6]
s = pd.to_datetime(df['DATE']).dt.to_period('M')
df['ADDED'] = (s + np.array(numbers)).dt.strftime("%Y/%m")
print (df)
   ID        DATE  NO    MONTH    ADDED
0   1  24/04/2019   6  2019/09  2019/07
1   1  24/04/2019   7  2019/09  2019/10

If you need to add the values of the NO column instead, use that column in place of the numbers list.
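The code for the NO-column variant did not survive in this copy of the answer, so the following is only a sketch of how it might look, not the answerer's original. It assumes the sample rows from the question, an explicit day/month/year format, and that passing the column through to_numpy() is acceptable.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1],
    "DATE": ["24/04/2019", "24/04/2019"],
    "NO": [6, 7],
    "MONTH": ["2019/09", "2019/09"],
})

# Monthly periods: adding an integer n moves the period forward by n months.
s = pd.to_datetime(df["DATE"], format="%d/%m/%Y").dt.to_period("M")

# to_numpy() hands over plain integers, so no index alignment is involved.
df["ADDED"] = (s + df["NO"].to_numpy()).dt.strftime("%Y/%m")
print(df)
```

Because periods carry no day component, this approach never has to decide what happens to the 24th; only the year and month move.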

Answer 2: (score: 0)

It looks like you need:

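import pandas as pd

filename = "denemedf.txt"  # the CSV file from the question
numbers = [3,6]
# parse_dates turns the DATE column into datetimes while reading
df = pd.read_csv(filename, parse_dates=["DATE"])
df["ADDED"] = [(d + pd.DateOffset(months=mon)) for d, mon in zip(df["DATE"], numbers)]
df["ADDED"] = df["ADDED"].dt.strftime("%Y/%m")
print(df)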

Output:

   ID       DATE  NO    MONTH    ADDED
0   1 2019-04-24   6  2019/09  2019/07
1   1 2019-04-24   7  2019/09  2019/10
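To try this without a file on disk, here is a self-contained sketch of the same approach. The inlined csv_text is an assumption that mirrors the rows shown in the question, and dayfirst=True is added here (it is not in the answer above) to state explicitly that the dates are day/month/year.

```python
import io
import pandas as pd

# The question's rows, inlined so no file is needed (assumed to match denemedf.txt).
csv_text = """ID,DATE,NO,MONTH
1,24/04/2019,6,2019/09
1,24/04/2019,7,2019/09
"""

numbers = [3, 6]

# dayfirst=True tells the parser that 24/04/2019 is day/month/year.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["DATE"], dayfirst=True)

# Shift each parsed date by its matching entry in numbers.
df["ADDED"] = [d + pd.DateOffset(months=m) for d, m in zip(df["DATE"], numbers)]
df["ADDED"] = df["ADDED"].dt.strftime("%Y/%m")
print(df)
```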