I have a dataset like this:
ID,DATE,NO,MONTH
1,24/04/2019,6,2019/09
1,24/04/2019,7,2019/09
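I also have a number of months to add to the DATE column. For example, 24/04/2019 plus 6 months should appear in my new column as 2019/10.
The error is:
TypeError: can only concatenate str (not "int") to str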
import pandas as pd
dataset = pd.read_csv("denemedf.txt", delimiter=",")
print(dataset['DATE'])
dataset['ADDED'] = 0 #new column
data_to_Array = np.asarray(dataset['DATE'])
#print(data_to_Array)
numbers = [3,6]
for i in range(len(data_to_Array)):
    added_value = data_to_Array[i] + numbers[i]
    dataset['ADDED'][i] = added_value
from datetime import datetime
print ( dataset['ADDED'].strftime("%Y/%m") )
How can I get this result in dataset['ADDED'] in year/month form?
I hope I have explained clearly what I want to do.
The expected result is as follows:
ID,DATE,NO,MONTH,ADDED
1,24/04/2019,6,2019/09,2019/7
1,24/04/2019,7,2019/09,2019/10
Answer 0 (score: 1)
You should first convert the DATE column strings to datetime with datetime.strptime. Then you can use a lambda function with pandas DateOffset to create the new column.
import datetime
import pandas as pd
dataset = pd.DataFrame({'DATE': ['24/04/2019', '24/04/2019'], 'NO': [6,7]})
dataset['DATE'] = dataset.apply(lambda x: datetime.datetime.strptime(x['DATE'], '%d/%m/%Y'), axis=1)
# int() guards against NumPy integer types being passed to DateOffset
dataset['ADDED'] = dataset.apply(lambda x: x['DATE'] + pd.DateOffset(months=int(x['NO'])), axis=1)
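The ADDED column produced this way holds full timestamps rather than the year/month strings the question asks for. A minimal sketch of the extra formatting step, assuming the same two-row sample frame; it swaps the row-wise apply for pd.to_datetime and a list comprehension, which is my substitution and not part of the answer:

```python
import pandas as pd

# Sample frame taken from the answer above.
dataset = pd.DataFrame({'DATE': ['24/04/2019', '24/04/2019'], 'NO': [6, 7]})

# Parse the day-first strings into datetime64 values.
dataset['DATE'] = pd.to_datetime(dataset['DATE'], format='%d/%m/%Y')

# Add NO months to each date, one row at a time.
dataset['ADDED'] = [d + pd.DateOffset(months=int(n))
                    for d, n in zip(dataset['DATE'], dataset['NO'])]

# Format the resulting timestamps as "year/month" strings.
dataset['ADDED'] = dataset['ADDED'].dt.strftime('%Y/%m')
print(dataset)
```

Here the months come from the NO column (6 and 7); to use a separate list such as numbers = [3, 6], zip over that list instead.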
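Answer 1 (score: 1)
Use Series.dt.to_period to convert the dates to monthly periods, so that the months can be added from a list or from a column, and finally use strftime to convert the output to strings: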
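If you need to add the months from a list:
import numpy as np
import pandas as pd
numbers = [3, 6]
# DATE is day-first, so give the format explicitly; 'M' means monthly periods
s = pd.to_datetime(df['DATE'], format='%d/%m/%Y').dt.to_period('M')
df['ADDED'] = (s + np.array(numbers)).dt.strftime("%Y/%m")
print (df)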
   ID        DATE  NO    MONTH    ADDED
0   1  24/04/2019   6  2019/09  2019/07
1   1  24/04/2019   7  2019/09  2019/10
To add the months from the NO column instead, use that column in place of the numbers array.
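A minimal sketch of the column-based variant, assuming the sample rows from the question are already in a DataFrame (the frame is built inline here only so the snippet runs on its own):

```python
import pandas as pd

# Sample rows from the question, built inline for a self-contained example.
df = pd.DataFrame({
    'ID': [1, 1],
    'DATE': ['24/04/2019', '24/04/2019'],
    'NO': [6, 7],
    'MONTH': ['2019/09', '2019/09'],
})

# Convert the day-first strings to monthly periods.
s = pd.to_datetime(df['DATE'], format='%d/%m/%Y').dt.to_period('M')

# Adding an integer array to a period Series shifts each period by that many months.
df['ADDED'] = (s + df['NO'].to_numpy()).dt.strftime('%Y/%m')
print(df)
```

With NO equal to 6 and 7, the ADDED values come out as 2019/10 and 2019/11, which differs from the list-based result because the offsets differ.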
Answer 2 (score: 0)
It looks like you need:
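import pandas as pd
numbers = [3,6]
filename = "denemedf.txt"  # file name used in the question
# dayfirst=True so that dates such as 24/04/2019 are read as day/month/year
df = pd.read_csv(filename, parse_dates=["DATE"], dayfirst=True)
df["ADDED"] = [(d + pd.DateOffset(months=mon)) for d, mon in zip(df["DATE"], numbers)]
df["ADDED"] = df["ADDED"].dt.strftime("%Y/%m")
print(df)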
Output:
   ID       DATE  NO    MONTH    ADDED
0   1 2019-04-24   6  2019/09  2019/07
1   1 2019-04-24   7  2019/09  2019/10
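To try this approach without the file on disk, the same steps can be run against in-memory CSV text. This is only a sketch: the csv_text variable and the use of io.StringIO are my additions, and dayfirst=True is assumed because the dates are written day/month/year.

```python
import io
import pandas as pd

# The question's sample data as in-memory CSV text (stand-in for the real file).
csv_text = """ID,DATE,NO,MONTH
1,24/04/2019,6,2019/09
1,24/04/2019,7,2019/09
"""

numbers = [3, 6]

# Parse DATE while reading; dayfirst=True treats 24/04/2019 as 24 April 2019.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["DATE"], dayfirst=True)

# Shift each date by the matching number of months, then format as year/month.
df["ADDED"] = [d + pd.DateOffset(months=mon) for d, mon in zip(df["DATE"], numbers)]
df["ADDED"] = df["ADDED"].dt.strftime("%Y/%m")
print(df)
```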