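I want to create a column in a DataFrame that is derived from two other columns.
In the example below, two DataFrames are created: df1 and df2.
A third DataFrame, df3, is then created by merging the first two. In df3, the "Dates" column is converted to datetime type.
After that, the "DateMonth" column is created, holding the month extracted from the "Dates" column.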
import pandas as pd

# df1 and df2:
id_sales = [1, 2, 3, 4, 5, 6]
col_names = ['Id', 'parrotId', 'Dates']
df1 = pd.DataFrame(columns = col_names)
df1.Id = id_sales
df1.parrotId = [1, 2, 3, 1, 2, 3]
df1.Dates = ['2012-12-25', '2012-08-20', '2013-07-23', '2014-01-14', '2016-02-21', '2015-10-31']
# parrot_id was not defined in the original snippet; [1, 2, 3] matches the ids used in df1
parrot_id = [1, 2, 3]
col_names2 = ['parrotId', 'months']
df2 = pd.DataFrame(columns = col_names2)
df2.parrotId = parrot_id
df2.months = [0, ('Fev, Mar, Apr'), 0]
# df3: merge on parrotId, parse the dates, then extract the month number
df3 = pd.merge(df1, df2, on = 'parrotId')
df3.Dates = pd.to_datetime(df3.Dates)
df3['DateMonth'] = df3.Dates.dt.month
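In df3, I need a new column whose value is 1 if the month in the "DateMonth" column appears in the "months" column.
My difficulty is that each value in the "months" column is either zero or a list of months.
How can I get this result?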
Answer 0 (score: 1)
Try the following solution:
from datetime import datetime

import pandas as pd

# define function for df.apply
def matched(row):
    if isinstance(row['months'], str):
        # for the case ('Feb, Mar, Apr'): convert each month abbreviation to its
        # number and return True if the month of 'Dates' is among them
        return row['Dates'].month in [datetime.strptime(mon.strip(), '%b').month for mon in row['months'].split(',')]
    else:
        # for numbers: return True if the months match
        return row['Dates'].month == row['months']

# df1 and df2:
id_sales = [1, 2, 3, 4, 5, 6]
col_names = ['Id', 'parrotId', 'Dates']
df1 = pd.DataFrame(columns = col_names)
df1.Id = id_sales
df1.parrotId = [1, 2, 3, 1, 2, 3]
df1.Dates = ['2012-12-25', '2012-08-20', '2013-07-23', '2014-01-14', '2016-02-21', '2015-10-31']
col_names2 = ['parrotId', 'months']
df2 = pd.DataFrame(columns = col_names2)
df2.parrotId = [1, 2, 3]
df2.months = [12, ('Feb, Mar, Apr'), 0]
df3 = pd.merge(df1, df2, on = 'parrotId')
df3.Dates = pd.to_datetime(df3.Dates)
# use apply to run the function on each row, astype converts boolean to int (0/1)
df3['DateMonth'] = df3.apply(matched, axis=1).astype(int)
df3
Output:
   Id  parrotId       Dates         months  DateMonth
0   1         1  2012-12-25             12          1
1   4         1  2014-01-14             12          0
2   2         2  2012-08-20  Feb, Mar, Apr          0
3   5         2  2016-02-21  Feb, Mar, Apr          1
4   3         3  2013-07-23              0          0
5   6         3  2015-10-31              0          0
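
One caveat with the approach above: the '%b' directive of strptime depends on the current locale, so under an English locale a spelling such as 'Fev' from the question would raise a ValueError. As a rough sketch of a way around that, the variant below swaps strptime for an explicit lookup table. The names MONTH_NUMBERS and to_month_set are hypothetical helpers introduced only for this illustration, and treating 0 as "no month" is an assumption based on the sample data. The rows are sorted by Id so the result does not depend on the row order that merge happens to return.

```python
import pandas as pd

# Hypothetical lookup table: English abbreviations plus the 'Fev' spelling
# that appears in the question's data (assumed to mean February).
MONTH_NUMBERS = {
    'Jan': 1, 'Feb': 2, 'Fev': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
    'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12,
}


def to_month_set(value):
    """Normalise one 'months' cell to a set of month numbers."""
    if isinstance(value, str):
        # 'Feb, Mar, Apr' -> {2, 3, 4}
        return {MONTH_NUMBERS[name.strip()] for name in value.split(',')}
    # Numeric cell: 0 is assumed to mean "no month", anything else a month number.
    return {int(value)} if value else set()


df1 = pd.DataFrame({
    'Id': [1, 2, 3, 4, 5, 6],
    'parrotId': [1, 2, 3, 1, 2, 3],
    'Dates': pd.to_datetime(['2012-12-25', '2012-08-20', '2013-07-23',
                             '2014-01-14', '2016-02-21', '2015-10-31']),
})
df2 = pd.DataFrame({'parrotId': [1, 2, 3],
                    'months': [12, 'Feb, Mar, Apr', 0]})

# Merge, then sort by Id so the row order is deterministic.
df3 = df1.merge(df2, on='parrotId').sort_values('Id').reset_index(drop=True)

# Compare each row's month number with the set parsed from its 'months' cell.
month_sets = df3['months'].map(to_month_set)
df3['DateMonth'] = [int(month in allowed)
                    for month, allowed in zip(df3['Dates'].dt.month, month_sets)]
print(df3)
```

Parsing the "months" column once with map keeps the string handling in one place, so further spellings only need a new entry in the lookup table.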